xAI's Grok Voice Agent Scores 92.3% on Big Bench Audio Rankings

xAI just released Grok Voice Agent, its first public speech-to-speech API, and it debuts with the highest score ever recorded on Big Bench Audio. The model outscored speech models from Google and OpenAI on spoken reasoning.

⬤ According to benchmark data from Artificial Analysis, Grok Voice Agent scored 92.3% on Big Bench Audio, edging past Google's Gemini 2.5 Flash Native Audio Thinking for the top spot. As xAI's first public speech-to-speech API, it signals that the company is ready to go head-to-head with established voice AI platforms.

⬤ Big Bench Audio tests how well speech models can reason through complex questions. It uses 1,000 audio questions adapted from Big Bench Hard, a well-known text benchmark for advanced reasoning. Grok Voice Agent sits at the top of the leaderboard, ahead of models like Gemini 2.5 Flash Native Audio, Nova 2.0 Sonic, GPT Realtime variants, and Qwen Omni models. The results suggest it handles tricky spoken prompts well and answers them accurately.

⬤ Speed matters too: Grok Voice Agent averages 0.78 seconds to first token, the third-fastest result on the board behind two Gemini 2.5 Flash variants. Pricing is a flat $0.05 per minute ($3/hour) of connected audio. The model comes with built-in tool calling, so you can plug it into web search, RAG workflows, and custom tools defined with JSON schemas.
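
To make the pricing and tool-calling points concrete, here is a minimal Python sketch. It is an illustration, not xAI's actual API. The cost helper uses only the $0.05 per minute figure quoted above. The tool definition follows the common "function tool" layout (name, description, JSON Schema parameters), and the names `search_knowledge_base`, `estimate_cost_usd`, and `validate_arguments` are hypothetical.

```python
import json
from decimal import Decimal

# Price quoted in the article: $0.05 per minute of connected audio.
PRICE_PER_MINUTE_USD = Decimal("0.05")


def estimate_cost_usd(connected_minutes: int) -> Decimal:
    """Estimate the charge for a session from its connected time in minutes."""
    if connected_minutes < 0:
        raise ValueError("connected_minutes must be non-negative")
    return PRICE_PER_MINUTE_USD * connected_minutes


# Hypothetical custom tool. The field layout is an assumption based on the
# widely used "function tool" convention, not taken from xAI's documentation.
SEARCH_TOOL = {
    "type": "function",
    "name": "search_knowledge_base",  # hypothetical tool name
    "description": "Look up a passage in an internal knowledge base.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "top_k": {"type": "integer"},
        },
        "required": ["query"],
    },
}

# Maps the two JSON Schema primitive types used above to Python types.
_JSON_TYPES = {"string": str, "integer": int}


def validate_arguments(tool: dict, raw_arguments: str) -> dict:
    """Parse a JSON argument string and check it against the tool's schema.

    Covers only required keys and the primitive types used in this sketch;
    a production system would use a full JSON Schema validator.
    """
    args = json.loads(raw_arguments)
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    schema = tool["parameters"]
    for key in schema["required"]:
        if key not in args:
            raise ValueError(f"missing required argument: {key}")
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            raise ValueError(f"unexpected argument: {key}")
        expected = _JSON_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"argument {key} must be of type {spec['type']}")
    return args
```

Checking the model's arguments before running a tool is a cheap guard against malformed calls, and keeping prices as `Decimal` values avoids floating-point drift when per-minute charges are summed across many sessions.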

⬤ Grok Voice Agent is heating up the already competitive speech AI space. It supports telephony through providers like Twilio and Vonage, handles 100+ languages, and offers multiple voice options, which makes it a fit for voice assistants, phone agents, and interactive voice apps. Its benchmark performance suggests that speech-based reasoning is becoming a real differentiator in next-gen AI systems, and xAI is making a serious play in the voice AI market.

Usman Salis

Usman has been in the blockchain space for 9 years and written dozens of articles about crypto in his career. He wants to put crypto on the global map.