GPT-5.2 Cuts Hallucinations by 30 to 40 Percent, System Card Shows

OpenAI's latest documentation reports that GPT-5.2 makes fewer factual errors than earlier GPT-5 models, a notable gain in reliability.

⬤ OpenAI's newest system card shows GPT-5.2 makes fewer factual mistakes than previous models. According to the update, hallucinations dropped by roughly 30 to 40 percent, one of the biggest improvements in factual accuracy across the GPT-5 lineup. The comparison is based on testing across thousands of prompts designed to mirror real ChatGPT conversations.

⬤ The numbers break down as follows. GPT-5 Thinking produced incorrect claims in about 1.1 percent of cases and GPT-5.1 Thinking in 1.5 percent, while GPT-5.2 Thinking dropped to 0.8 percent. The absolute gap is wider for the share of responses containing at least one major error: 7.3 percent for GPT-5 Thinking, 8.8 percent for GPT-5.1 Thinking, and 5.8 percent for GPT-5.2 Thinking.
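
For readers who want to check the arithmetic, here is a minimal Python sketch that converts the rates quoted above into relative reductions for GPT-5.2 Thinking. The dictionary and function names are illustrative and not taken from OpenAI's system card, and the inputs are only the rounded percentages cited in this article, so the results are approximate.

```python
# Error rates in percent, as quoted in this article (rounded figures).
CLAIM_ERROR_RATE = {
    "GPT-5 Thinking": 1.1,
    "GPT-5.1 Thinking": 1.5,
    "GPT-5.2 Thinking": 0.8,
}
RESPONSE_ERROR_RATE = {
    "GPT-5 Thinking": 7.3,
    "GPT-5.1 Thinking": 8.8,
    "GPT-5.2 Thinking": 5.8,
}


def relative_reduction(baseline: float, new: float) -> float:
    """Percentage drop from baseline to new, relative to baseline."""
    return (baseline - new) / baseline * 100


def reductions_vs(rates: dict, target: str = "GPT-5.2 Thinking") -> dict:
    """Relative reduction of the target model against every other model."""
    new = rates[target]
    return {
        name: round(relative_reduction(rate, new), 1)
        for name, rate in rates.items()
        if name != target
    }


if __name__ == "__main__":
    print("Incorrect claims:", reductions_vs(CLAIM_ERROR_RATE))
    print("Responses with a major error:", reductions_vs(RESPONSE_ERROR_RATE))
```

On these rounded inputs, the relative reduction ranges from about 21 percent to about 47 percent depending on the baseline model and the metric, so the headline range of 30 to 40 percent is best read as an approximate summary.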

⬤ OpenAI measured this using an AI grading model with web access to catch factual mistakes in generated responses. Applying the same grader to each model on real-world prompts allows a direct comparison between versions. The company says GPT-5.2 matches or beats its predecessors, which suggests that design and training improvements are paying off.

⬤ Better accuracy matters for AI's commercial future. Factual reliability remains one of the biggest concerns for businesses considering AI adoption at scale. As models like GPT-5.2 show they can deliver more consistent, trustworthy results, they become more practical for complex enterprise workflows, which could push the industry toward wider deployment of advanced AI systems.

Marina Lyubimova - editor and writer at Aigazine.com, blending years of financial journalism with a growing focus on the world of AI and innovation.