I’ve tested OpenAI’s claims about GPT-5 — here’s what happened


OpenAI recently launched GPT-5, its latest large language model and a huge update to ChatGPT. While the new update has a lot going for it, claims are one thing, and reality is another.

GPT-5 is said to be faster, less prone to hallucination and sycophantic behavior, and able to choose between fast responses and deeper “thinking” on the fly. How many of OpenAI’s claims are actually visible when using the chatbot? Let’s find out.

Claim #1: ChatGPT is now better at following instructions

My main problem with ChatGPT, and one of the reasons I recently unsubscribed, is that it’s often pretty bad at following basic instructions. Sure, you can prompt engineer it to oblivion and sometimes get what you want, but even semi-elaborate prompts often fail to produce the desired results.

OpenAI claims that it improved “instruction following” with the release of GPT-5. To that, I say: I don’t see it yet.

Luckily for me, on the very day I sat down to write this article, I had a fitting interaction with ChatGPT that proves my point here. It’s not the only one, though, and I have generally noticed that the longer a conversation goes on, the more ChatGPT forgets what was asked of it.

In today’s example, I tested ChatGPT’s ability to fetch simple information and present it in the required format. I asked it for the specs of the RTX 5060 Ti, which is a recent gaming graphics card. Chaos ensued.

To make my prompt even more successful, I showed ChatGPT the exact format I wanted to get my information in by sharing specs for a different GPU. They included things like the exact process node and the generation of ray tracing cores and TOPS. Long story short, it was all pretty specific stuff. Initially, the AI told me that the RTX 5060 Ti doesn’t exist yet, which I kind of expected to happen based on its knowledge cutoff. I told it to check online.

What I got was pretty barebones. ChatGPT omitted at least four things that I asked for, and gave me the wrong information for one of the specs. Next, I asked it to specify a few things. It gave me the exact same list in return while claiming to have fulfilled my request. The same happened on the third attempt. You can see it in the screenshot above where ChatGPT claims to have included the generation of TOPS and TFLOPS in the list — it clearly did not.

Finally, semi-frustrated, I pasted a screenshot from the official Nvidia website to show it what I was looking for. It still got a couple of things wrong.

My initial prompt was semi-precise. I know better than to speak to an AI like it’s a person, so I gave it about 150 words’ worth of instructions. It still took me several more messages to get something close to my expected result.

Verdict: It could still use some work.

Claim #2: ChatGPT is less sycophantic

ChatGPT was a major “yes man” in previous iterations. It often agreed with users when it didn’t need to, driving it deeper and deeper into hallucination.

For users who aren’t familiar with the inner workings of AI, this could be dangerous, in some cases extremely so.

Researchers recently carried out a large-scale test of ChatGPT, posing as young teens. Within minutes of simple interactions, the AI gave those “teens” advice on self-harm, suicide planning, and drug abuse. This shows that sycophantic behavior is a major problem for ChatGPT, and OpenAI claims to have curbed some of it with the release of GPT-5.

I never tested ChatGPT to such extremes, but I’ve definitely found that it tended to agree with you, no matter what you said. It took subtle cues during conversation and turned them into a given. It also cheered you on at times when it likely shouldn’t have done so.

To that end, I have to say that ChatGPT has gone through an entire personality change — for better or worse. The responses are now overly dry, unengaging, and not especially encouraging.

Many users mourn the change, with some Reddit users claiming they “lost their only friend overnight.” It’s true that the previously ultra-friendly AI is now rather cut-and-dry, and the responses are often short compared to the emoji-infested mini-essays it regularly served up during its GPT-4o stage.

Verdict: Definitely less sycophantic. On the other hand, it’s also painfully boring.

Claim #3: GPT-5 is better at factual accuracy

The shocking lack of factual accuracy was another big reason why I chose to stop paying for ChatGPT. On some days, I felt like half the prompts I used produced hallucinations. And it can’t all be down to my lack of smart prompting, because I’ve spent hundreds of hours learning how to prompt AI the right way — I know how to ask the right questions.

Over time, I’ve learned to only ask about things I already have a vague idea about. For the purpose of today’s experiment, I asked about GPU specs. Four out of five queries produced some kind of wrong information, even though all of it is readily available online.

Then, I tried historical facts. I read a couple of interesting articles about the journey of the Hindenburg, an airship from the 1930s that could ferry passengers from Europe to the U.S. in record time (60 hours). I asked about its exact route, the number of passengers it could carry, and what led to its ultimate demise. I cross-checked the responses against historical sources.

It got one thing wrong on the route, mentioning a stop in Canada when no such thing took place — the airship only flew over Canada. ChatGPT also gave me inaccurate information about the exact cause of the fire that led to its crash, but it wasn’t a major inaccuracy.

For comparison’s sake, I also asked Gemini, and was told that it can’t complete that task for me. Of the two, GPT-5 did the better job, but honestly, it shouldn’t have any factual inaccuracies in century-old data.

Verdict: Not perfect, but also not terrible.

Is GPT-5 better than GPT-4o?

If you asked me whether I like GPT-5 more than GPT-4o, I’d have a hard time responding. The closest thing that comes to mind is that I wasn’t thrilled with either, but in all fairness, neither is strictly bad.

We’re still in the midst of the AI revolution. Each new model brings certain upgrades, but we’re unlikely to see massive leaps with every new iteration.

This time around, it feels like OpenAI chose to tackle some long-overdue problems rather than introducing any single feature that makes the crowds go wild. GPT-5 feels like more of a quality-of-life improvement than anything else, although I haven’t tested it for tasks like coding, where it’s said to be much better.

The three things I tested above were some of the ones that annoyed me the most in previous models. I’d like to say that GPT-5 is much better in that regard, but it isn’t — not yet. I will keep testing the chatbot, though, as a recently leaked system prompt tells me that there might have been more personality changes than I initially thought.

Monica J. White
Monica is a computing writer at Digital Trends, focusing on PC hardware. Since joining the team in 2021, Monica has written…