Elon Musk's xAI has rolled out the Grok 4.20 API, quietly expanding its developer platform with multiple large-context models. First flagged by XFreeze on the platform's pricing page, the update brings three new variants built to handle different AI workloads. All of them support a 2-million-token context window, one of the largest available in any publicly accessible language model right now.
The three new models are grok-4.20-multi-agent-beta, grok-4.20-beta-reasoning, and grok-4.20-beta-non-reasoning. Each is built for a distinct use case: collaborative agent networks, deep reasoning pipelines, and high-speed inference. With 2 million tokens per request, developers can feed in thousands of document pages, full codebases, or marathon conversations without breaking them into chunks.
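Assuming the new variants follow the OpenAI-compatible chat-completions convention xAI has used for earlier Grok models (the endpoint URL and payload field names below are assumptions, not confirmed by the pricing page), a long-context request to the reasoning variant might be assembled like this:

```python
import json

# Hypothetical sketch: building one chat-completions request whose user
# turn concatenates many document chunks -- viable here because of the
# 2M-token context window. Endpoint and field names are assumed to match
# the OpenAI-compatible convention; verify against xAI's API docs.
API_URL = "https://api.x.ai/v1/chat/completions"  # assumed endpoint

def build_request(model: str, system: str, user_chunks: list[str]) -> dict:
    """Assemble a single request body from many document chunks."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            # One large user turn instead of many chunked requests.
            {"role": "user", "content": "\n\n".join(user_chunks)},
        ],
    }

payload = build_request(
    "grok-4.20-beta-reasoning",
    "Summarize the following documents.",
    ["Page 1 text...", "Page 2 text..."],
)
body = json.dumps(payload)  # ready to POST with any HTTP client
```

The point of the sketch is the shape of the workflow: one request carrying the whole corpus, rather than a retrieval or chunking pipeline stitched together client-side.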
Pricing follows a usage-based structure. For prompts under 128K tokens, the rate is $2.00 per million input tokens and $6.00 per million output tokens. Infrastructure limits include throughput of up to 4 million tokens per minute and around 607 requests per minute, positioning these models for enterprise automation and agent-scale deployments.
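At the published sub-128K rates, estimating per-request cost is straightforward arithmetic; a minimal sketch (the example token counts are illustrative, not from the article):

```python
# Published sub-128K pricing tier:
# $2.00 per million input tokens, $6.00 per million output tokens.
INPUT_RATE = 2.00 / 1_000_000   # dollars per input token
OUTPUT_RATE = 6.00 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request under the sub-128K tier."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A hypothetical 100K-token prompt with a 4K-token completion:
# 100_000 * $2/M = $0.20 input, 4_000 * $6/M = $0.024 output.
cost = request_cost(100_000, 4_000)  # ~$0.224 total
```

The article's rates apply to prompts under 128K tokens; the pricing page does not state the over-128K tier here, so that case is left out of the sketch.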
The Grok 4.20 launch is part of a wider industry push toward bigger context windows, faster inference, and modular agent architectures. xAI is competing directly with other major LLM providers, and this release signals that the company is not slowing its platform expansion as the generative AI race heats up.
Usman Salis