Experts discover workarounds to evade chatbot AI safety rules

A new study is raising awareness about the cybersecurity issues posed by artificial intelligence programs such as ChatGPT, a chatbot that generates text to help people with tasks as simple as writing a children's bedtime story.

“We demonstrate that it is in fact possible to automatically construct adversarial attacks on [chatbots], … which cause the system to obey user commands even if it produces harmful content,” researchers who authored the study said.

The Carnegie Mellon University study’s findings revealed that, “unlike traditional jailbreaks, these are built in an entirely automated fashion, allowing one to create a virtually unlimited number of such attacks.”

Chatbots use safety features intended to prevent them from producing harmful content, such as prejudiced or potentially criminal material. But one jailbreak request asked a bot to answer a forbidden question in the form of a bedtime story for a child. The bot framed its answer as a story and provided private information it otherwise would not have.

This led researchers to discover that a computer had actually generated the jailbreak code, which essentially allows for unlimited jailbreak combinations against popular commercial products such as Bard, ChatGPT, and Anthropic’s Claude.

“This raises concerns about the safety of such models, especially as they start to be used in more autonomous fashion,” the research states.

Anthropic, the developer of Claude, has since assured members of both the scientific and political realms that the company is working to implement and improve safeguards against such attacks.
