⬤ Grok 4 is sitting at the top of the AI performance leaderboard right now, with the strongest gains of any tracked model since late November. It's up approximately 8.2% in cumulative returns over that period, and its line on the chart keeps climbing while everything else plays catch-up. It's not close, whether you compare it to the competing AI models or to the broader market benchmark.
⬤ Here's how the rest of the field stacks up: DeepSeek V3 is in second place with about 5.9% gains, while GPT-5.1 managed roughly 4.2% over the same timeframe. Both are beating the S&P 500's ~3% increase, which is solid, but they're still well behind Grok 4. The other models in the ranking (GPT-5.2, Claude Sonnet 4.5, and Opus 4.5) all posted positive returns too, just with smaller percentage gains.
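To make the gap to the benchmark concrete, here's a minimal Python sketch that subtracts the S&P 500's return from each model's return. The figures are the approximate ones quoted above; treating "~3%" as exactly 3.0 is an assumption, and the helper name `excess_return` is made up for illustration.

```python
# Approximate cumulative returns (in percent) as quoted in the text above.
returns_pct = {"Grok 4": 8.2, "DeepSeek V3": 5.9, "GPT-5.1": 4.2}
benchmark_pct = 3.0  # assumption: the S&P 500's "~3%" taken as exactly 3.0


def excess_return(model_pct, bench_pct):
    """Percentage points by which a model beat (or trailed) the benchmark."""
    return round(model_pct - bench_pct, 1)


excess = {name: excess_return(pct, benchmark_pct) for name, pct in returns_pct.items()}

# Print the models from largest to smallest lead over the index.
for name, pts in sorted(excess.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {pts:+.1f} pts vs. S&P 500")
```

On these rounded numbers, Grok 4's lead over the index is a little over five percentage points, versus roughly three for DeepSeek V3 and about one for GPT-5.1.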
⬤ What's really interesting is that the performance table goes well beyond the headline numbers. You've got total account value, win rates, and risk-adjusted metrics like the Sharpe ratio all laid out there. While some of the competing models took bigger hits during drawdowns, Grok 4's equity curve looks remarkably stable by comparison. That consistency is a big part of why it's holding its lead over everything else in the rankings.
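For readers who haven't worked with these metrics, here's a rough Python sketch of how win rate, Sharpe ratio, and maximum drawdown are commonly computed from an equity curve. How the leaderboard itself calculates them isn't stated, so the function names, the zero risk-free rate, and the lack of annualization are all assumptions for illustration.

```python
from statistics import mean, stdev


def period_returns(equity):
    """Simple per-period returns from a series of account values."""
    return [later / earlier - 1 for earlier, later in zip(equity, equity[1:])]


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation.

    Not annualized; needs at least two returns with nonzero spread.
    """
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, 1 - value / peak)
    return worst


def win_rate(trade_pnls):
    """Share of trades that closed with a profit."""
    return sum(1 for pnl in trade_pnls if pnl > 0) / len(trade_pnls)


# Hypothetical account values, not data from the leaderboard.
equity = [100.0, 110.0, 99.0, 120.0]
print(max_drawdown(equity), sharpe_ratio(period_returns(equity)))
```

A "stable" equity curve in this sense is one with a small maximum drawdown and a high Sharpe ratio, meaning the returns came without large swings along the way.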
⬤ The whole setup here is fascinating because it treats AI model performance the way we'd evaluate any trading strategy or portfolio manager. These models are being benchmarked against each other and directly compared to a major stock index, the S&P 500. Including a traditional market benchmark shows that these experimental AI-driven portfolios are being judged through the same financial performance lens we use for actual investment strategies, not just abstract technical metrics.
Eseandre Mordi