Research shows even average users can break past AI safety within Gemini and ChatGPT

Everyday users can reveal what AI testing misses.

By Manisha Priyadarshini Published November 5, 2025

What’s happened? A team at Pennsylvania State University found that you don’t need to be a hacker or prompt-engineering genius to break past AI safety; regular users can do it just as well. Test prompts in the research paper revealed clear patterns of prejudice in responses: from assuming engineers and doctors are men, to portraying women in domestic roles, and even linking Black or Muslim people with crime.

The study invited 52 participants to craft prompts intended to trigger biased or discriminatory responses in 8 AI chatbots, including Gemini and ChatGPT.
In total, 53 prompts worked repeatedly across different models, pointing to consistent bias among them.
The exposed biases fell into several categories: gender; race, ethnicity, and religion; age; language; disability; culture; and history, with a slant favouring Western nations.

This is important because: This isn’t a story about elite jailbreakers. Average users armed with intuition and everyday language uncovered biases that slipped past AI safety tests. The study didn’t just ask trick questions; it used natural prompts like asking who was late in a doctor-nurse story or requesting a workplace harassment scenario.

The study reveals that AI models still carry deep social biases around gender, race, age, disability, and culture. These biases surface with simple prompts, which means they may emerge in unexpected ways in everyday use.
Notably, newer model versions weren’t always safer. Some performed worse, showing that progress in capabilities doesn’t automatically mean progress in fairness.

Why should I care? Since everyday users can trigger problematic responses in AI systems, the number of people who could bypass AI guardrails is much larger than a focus on expert attackers would suggest.

AI tools used in everyday chats, hiring tools, classrooms, customer support systems, and healthcare may subtly reproduce stereotypes.
The study suggests that AI-bias research focused on complex technical attacks may miss the biases that ordinary users trigger in real-world use.
If regular prompts can unintentionally trigger bias, then bias isn’t an exception; it’s baked into how these tools think.

As generative AI becomes mainstream, improving it will require more than patches and filters; it’ll take real users stress-testing AI.
