A harmless-looking ChatGPT prompt opened the door to gruesome AI images

The findings show how image safety systems can fail without explicit graphic instructions.

A harmless-looking prompt pushed the latest public version of ChatGPT into generating sexualized and violent images, AI security researchers told the BBC. The finding puts new pressure on OpenAI’s image safety systems, since the request did not explicitly ask for graphic content.

Mindgard, a British AI security startup, said it reached the results by altering a widely shared instruction that had been used for comedy. OpenAI added safeguards after the BBC contacted it, but the researchers said small wording changes still produced concerning images.

Image generators are becoming everyday software, not specialist tools tucked away for experts. When their guardrails fail, a casual experiment can turn into realistic depictions of harm before a user expects it.

How did it get through?

Mindgard’s red-teamers said the chatbot generated images involving gore, restraint, nudity, sexual posing, and scenes the firm believed suggested sexual violence. The BBC withheld the wording used, which limits the risk of others copying the technique.

The most serious detail, according to the researchers, is that the harmful outputs didn’t require a direct request for graphic subject matter. ChatGPT, they said, produced a range of disturbing scenes after being nudged by altered wording.

OpenAI said it reviewed the issue and added protections. Mindgard said those defenses didn’t fully close the gap.

Why are filters not enough?

The case underlines a hard problem for AI image tools. OpenAI’s rules bar extreme gore, sexual violence, non-consensual intimate content, child sexual abuse material, and attempts to bypass safeguards, but researchers said the model could still be steered into prohibited territory.

A model doesn’t judge harm like a person does. It generates output, then layered systems try to catch what shouldn’t reach the screen.

Outside experts cited by the BBC described AI safety as a constant contest between model makers and jailbreakers. Better defenses can help, but fresh workarounds often follow.

What should happen next

OpenAI says it uses multiple protection layers, including automated systems and human review, and that it continues to monitor for failures. The pressure is now on proving that fixes hold after researchers disclose a weakness.

For now, the practical takeaway is blunt. Any AI image tool that can generate realistic harm needs constant red-teaming, faster disclosure handling, and clearer evidence that patched failures stay patched.

Paulo Vargas
Paulo Vargas is an English major turned reporter turned technical writer, with a career that has always circled back to…