⬤ xAI's Grok 4.20-beta1 has climbed to #2 on the Search Arena leaderboard with a score of 1225 plus or minus 8 -- a result that puts it squarely in the same tier as the most capable AI search systems available today. The jump signals how fast Grok is closing the gap on established models.
⬤ At the top of the board sits Claude Opus 4.6 Search with 1255 plus or minus 10, followed by Grok 4.20-beta1, then GPT-5.2 Search at 1219 plus or minus 6. Gemini 3 Flash Grounding and Gemini 3 Pro Grounding also appear in the top tier -- a sign of how closely the leading systems have converged, as detailed in the recent report on Grok 4.20 Beta topping Search Arena.
Even with its ranking still marked preliminary, Grok 4.20-beta1 is already beating several well-known AI systems in search-focused testing.
⬤ The leaderboard entry is still labeled "preliminary," meaning more votes could shift the ranking. Still, the current result places Grok ahead of several well-established models. Similar momentum is visible elsewhere -- an 8B model recently hit 94.5% using the PaCore framework, outperforming systems many times its size.
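To make the "more votes could shift the ranking" point concrete, here is a minimal Python sketch that checks whether the quoted score ranges overlap. It assumes each "plus or minus" figure is a symmetric margin around the score, and it treats overlap as a rough heuristic rather than a formal significance test; the function names are illustrative and not part of any leaderboard API. The two Gemini entries are left out because the article gives no scores for them.

```python
# Scores and margins exactly as quoted in the article: (score, plus-or-minus margin).
scores = {
    "Claude Opus 4.6 Search": (1255, 10),
    "Grok 4.20-beta1": (1225, 8),
    "GPT-5.2 Search": (1219, 6),
}


def interval(score, margin):
    """Return the (low, high) range implied by a score and a symmetric margin (assumption)."""
    return (score - margin, score + margin)


def overlaps(model_a, model_b):
    """True if the two models' score ranges share at least one point."""
    low_a, high_a = interval(*scores[model_a])
    low_b, high_b = interval(*scores[model_b])
    return low_a <= high_b and low_b <= high_a


if __name__ == "__main__":
    for a, b in [
        ("Grok 4.20-beta1", "GPT-5.2 Search"),
        ("Claude Opus 4.6 Search", "Grok 4.20-beta1"),
    ]:
        print(f"{a} vs {b}: ranges overlap = {overlaps(a, b)}")
```

Under these assumptions, the Grok 4.20-beta1 and GPT-5.2 Search ranges overlap (1217 to 1233 versus 1213 to 1225), so additional votes could plausibly swap #2 and #3, while the Claude Opus 4.6 Search range (1245 to 1265) sits clear of both.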
⬤ Leaderboards like Search Arena have become a standard gauge for comparing reasoning, retrieval, and real-time knowledge across systems. Grok 4.20-beta1's rise fits a broader pattern of rapid capability growth -- one also on display when Claude Opus 4.6 solved Knuth's math problem after weeks of failed human attempts.
Saad Ullah