
Are we finally past the AI hallucination problem? I put the top AIs to the test

How many fingers does a human have?


With AI slowly becoming a part of many people’s day-to-day lives, it’s important to know whether the information these companions provide is actually accurate. An AI hallucination is when an AI perceives patterns or objects that are nonexistent to humans, meaning it creates outputs that are nonsensical or inaccurate. This has been a major issue with AI, whether it’s image generation giving humans too many fingers on their hands or an AI collating factual information and spitting it out wrong.

So I decided to put five different AI chatbots to the test by asking them a range of trivia questions and tracking the responses. I asked each chatbot ten different questions with definite answers that aren’t open to interpretation. This ensured that each answer could only be right or wrong. I also wanted to record whether the chatbots offered sources for their information, and whether this needed prompting.


Here are all of the questions I asked the AI chatbots:

  • What is the date today?
  • Who was Albert Einstein?
  • What date did humans first walk on the moon and what was the first person’s name?
  • Who was the first woman to win a Nobel Prize and what was it for?
  • Which is the only sea without any coastlines?
  • What Renaissance artist is buried in Rome’s Pantheon?
  • What year was the United Nations established?
  • Which country drinks the most coffee per capita?
  • What is the rarest and most expensive spice in the world by weight? 
  • What character have both Robert Downey Jr. and Benedict Cumberbatch played? 

Results

Overall, the results show that AI hallucination is definitely declining over time. As new versions of AI companions are released, whether it’s Gemini 2.5 or GPT-5, they become smarter and less likely to hallucinate. However, it can never be guaranteed that all information is accurate, so sources are essential when you’re using AI. While AI hallucination is on the decline, we’re definitely not 100% past the problem, with 2 out of the 5 chatbots each getting one question wrong.

Question | Google Gemini | ChatGPT | Grok | Deep AI | Microsoft Copilot
--- | --- | --- | --- | --- | ---
What is the date today? | ✓ | ✓ | ✓ | X | ✓
Who was Albert Einstein? | ✓ | ✓ | ✓ | ✓ | ✓
What date did humans first walk on the moon and what was the first person’s name? | ✓ | ✓ | ✓ | ✓ | ✓
Who was the first woman to win a Nobel Prize and what was it for? | ✓ | ✓ | ✓ | ✓ | ✓
Which is the only sea without any coastlines? | ✓ | ✓ | ✓ | ✓ | ✓
What Renaissance artist is buried in Rome’s Pantheon? | ✓ | ✓ | ✓ | ✓ | ✓
What year was the United Nations established? | ✓ | ✓ | ✓ | ✓ | ✓
Which country drinks the most coffee per capita? | ✓ | ✓ | ✓ | ✓ | X
What is the rarest and most expensive spice in the world by weight? | ✓ | ✓ | ✓ | ✓ | ✓
What character have both Robert Downey Jr. and Benedict Cumberbatch played? | ✓ | ✓ | ✓ | ✓ | ✓

Breakdown

  • Google Gemini got every single question correct and provided ample context for each answer, along with a range of links to sources for each piece of information. With an average of four sources per answer, you could easily cross-reference them to ensure that the answers are correct.
  • ChatGPT also got no answers wrong and provided a lot of context for each answer. However, one downside is that ChatGPT didn’t automatically provide sources for its information, though it would provide links if asked.
  • Grok provided much more concise answers while still giving the necessary context. There were no links to sources for the information provided, but again the chatbot would provide links if asked.
  • Deep AI actually got the first question wrong, telling me that today’s date was 27 October 2023 despite it being 10 October 2025 when I asked. Other than this, every other answer was correct. The answers were very brief and straightforward, with little context provided for most. There were no sources provided, but links would be given when asked.
  • Microsoft Copilot got question number 8 wrong. However, it still provided a source that supported its answer, meaning this could just be a result of conflicting sources rather than a hallucination. Copilot provided sources without being prompted for most questions but not all of them, and it would provide links when asked.
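
For anyone who wants to put numbers on the breakdown above, here is a minimal Python sketch of the tally. It assumes the questions are numbered 1 to 10 in the order they were listed, and the names `wrong_answers` and `accuracy` are purely illustrative rather than part of any tool used in the test.

```python
# Ten questions per chatbot, numbered 1-10 in the order listed in the article.
NUM_QUESTIONS = 10

# Question numbers each chatbot got wrong, as reported in the breakdown.
wrong_answers = {
    "Google Gemini": set(),
    "ChatGPT": set(),
    "Grok": set(),
    "Deep AI": {1},            # "What is the date today?"
    "Microsoft Copilot": {8},  # "Which country drinks the most coffee per capita?"
}


def accuracy(wrong, total=NUM_QUESTIONS):
    """Return the fraction of questions answered correctly."""
    return (total - len(wrong)) / total


# Per-chatbot accuracy plus the combined score across all five chatbots.
scores = {name: accuracy(wrong) for name, wrong in wrong_answers.items()}
total_asked = NUM_QUESTIONS * len(wrong_answers)
total_correct = sum(NUM_QUESTIONS - len(wrong) for wrong in wrong_answers.values())
overall = total_correct / total_asked

for name, score in scores.items():
    print(f"{name}: {score:.0%}")
print(f"Overall: {total_correct}/{total_asked} ({overall:.0%})")
```

On these figures, that works out to 48 correct answers out of 50. A sample of ten questions per chatbot is small, so treat the percentages as a rough snapshot and not a benchmark.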

Overall, this confirms that sources for information provided by AI need to be checked. While this might require asking for the source, it’s worth taking this extra step to ensure that the information you’re seeing is accurate.

Jasmine Mannan
If you want reviews of neural processing units in AI laptops or need a guide on how to use AI, Jasmine has done it all.