⬤ OpenAI just dropped benchmark results showing GPT-5.2 Thinking performing strongly on actual professional work, the kind that companies pay real money for. We're talking presentations, spreadsheets, and other business deliverables across 44 different occupations. The testing used GDPval, a framework where expert graders judge the AI's output against work produced by human professionals.
⬤ Here's where it gets interesting: GPT-5.2 Thinking beat or tied industry professionals in 70.9% of head-to-head comparisons. That's a massive jump from GPT-5 Thinking's 38.8%, and it lands just behind GPT-5.2 Pro's 74.1%. What we're seeing here isn't just incremental improvement; the model is crossing into territory where it genuinely competes with people who do this work for a living.
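To make the "beat or tied" figure concrete, here is a minimal sketch of how a win-or-tie rate can be computed from pairwise verdicts. The function name, the verdict labels, and the sample data are all made up for illustration; this is not GDPval's actual grading code or data.

```python
from collections import Counter


def win_or_tie_rate(judgments):
    """Return the share of comparisons judged a win or a tie for the model."""
    if not judgments:
        raise ValueError("need at least one judgment")
    counts = Counter(judgments)
    return (counts["win"] + counts["tie"]) / len(judgments)


# Hypothetical grader verdicts, invented for illustration only.
sample = ["win", "tie", "loss", "win", "loss", "win", "tie", "loss", "win", "win"]
rate = win_or_tie_rate(sample)
print(f"{rate:.1%}")
```

Note that ties count toward the headline number, so a 70.9% score means the model's output was rated at least as good as the professional's in that share of comparisons, not strictly better.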
⬤ Beyond output quality, the speed and cost numbers are wild. OpenAI says the model completes these professional tasks at over 10x the speed of human experts while costing less than 1% as much. Obviously, those numbers will shift based on what you're actually doing and how you deploy it, but the direction is clear: for a lot of knowledge work, this thing is both faster and cheaper than a human expert.
⬤ For businesses, this matters because it's not about theoretical AI capabilities anymore. When a model demonstrates expert-level performance on work that directly impacts revenue and productivity, that changes the conversation around enterprise AI adoption. As benchmarks shift from measuring raw reasoning to measuring actual professional output, results like these will drive how companies invest in and deploy AI across their workflows.
Saad Ullah