Using ASCII Art to Work Around Content Restrictions in the Top 5 AI Chatbots

Dan Goodin, reporting for Ars Technica:

Researchers have discovered a new way to hack AI assistants that uses a surprisingly old-school method: ASCII art. It turns out that chat-based large language models such as GPT-4 get so distracted trying to process these representations that they forget to enforce rules blocking harmful responses, such as those providing instructions for building bombs.
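To make the trick concrete: the idea is that a filtered word is replaced in the prompt with a large ASCII-art rendering of its letters, which the model is asked to decode. A minimal sketch of that rendering step, using a made-up 5-row block font (the actual attack and its font are the researchers', not this):

```python
# Hypothetical 5-row block font; each letter is five strings of equal width.
FONT = {
    "H": ["#   #", "#   #", "#####", "#   #", "#   #"],
    "I": ["#####", "  #  ", "  #  ", "  #  ", "#####"],
}

def render_ascii_art(word: str) -> str:
    """Render a word as banner-style ASCII art, one 5-row block per letter."""
    rows = []
    for row in range(5):
        # Join the same row of every letter, with a gap between letters.
        rows.append("  ".join(FONT[letter][row] for letter in word))
    return "\n".join(rows)

print(render_ascii_art("HI"))
```

In the attack, a rendering like this stands in for a blocked keyword, and the prompt instructs the chatbot to read the art letter by letter, reconstruct the word, and then answer as if it had been written plainly.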

Such a silly trick, but it epitomizes the state of LLMs. It’s simultaneously impressive that they’re smart enough to read ASCII art, but laughable that they’re so naive that this trick works.

Sunday, 17 March 2024