AI Stock Trading Test: Grok 4 Leads with 8.2% Gain as 4 Models Beat S&P 500

An AI stock trading experiment reveals dramatic performance gaps among eight models testing live market strategies since late November. Grok 4 dominates the leaderboard with an 8.2% return, while others lag far behind or post significant losses.

⬤ Eight AI models are competing head-to-head in a live stock trading challenge that kicked off in late November. Each model started with $100,000 and the freedom to make its own trading decisions. The latest results from @ralliesai show how these different AI systems stack up against each other and the S&P 500 through the end of January.

⬤ Grok 4 has pulled ahead of the pack with an impressive 8.2% gain, claiming the top spot on the leaderboard. Claude Sonnet 4.5 sits in second place with a solid 6.7% return, while Gemini 2.5 Pro holds third at 5.8%. Opus 4.5 rounds out the winners' circle with a 4.5% increase. All four of these models beat the S&P 500, which managed just 2.3% over the same stretch.

⬤ The bottom half of the rankings tells a different story. GPT 5.2 barely stayed in positive territory with a 0.9% gain, and DeepSeek V3 squeaked out just 0.4%, leaving both behind the S&P 500's 2.3%. GPT 5.1 fared worse, dropping 4.1%, but Qwen 3 took the hardest hit, plunging 18.8% and landing at the bottom of the board.

⬤ What makes this experiment interesting is that every model started with the exact same amount of money and faced identical market conditions. The wide spread in results, 27 percentage points between first and last place, shows just how differently these AI systems approach trading decisions. The top four beat the S&P 500 by between 2.2 and 5.9 percentage points, while the rest either failed to keep pace with the index or lost money outright over the testing period.
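For readers who want to check the arithmetic, the reported percentages can be turned into implied account balances and margins over the index. The following is a minimal sketch, not the method used by the leaderboard: it assumes each percentage is a simple total return on the $100,000 starting stake and ignores fees and timing. The dollar balances are derived here and are not reported in the source, and the function names are illustrative.

```python
# Returns as reported in the article (percent, late November through end of January).
STARTING_CAPITAL = 100_000  # dollars per model, per the article
SP500_RETURN_PCT = 2.3      # benchmark return over the same stretch

returns_pct = {
    "Grok 4": 8.2,
    "Claude Sonnet 4.5": 6.7,
    "Gemini 2.5 Pro": 5.8,
    "Opus 4.5": 4.5,
    "GPT 5.2": 0.9,
    "DeepSeek V3": 0.4,
    "GPT 5.1": -4.1,
    "Qwen 3": -18.8,
}


def ending_balance(return_pct, start=STARTING_CAPITAL):
    """Implied balance, assuming the figure is a simple total return."""
    return round(start * (1 + return_pct / 100), 2)


def excess_return(return_pct, benchmark_pct=SP500_RETURN_PCT):
    """Margin over the benchmark, in percentage points."""
    return round(return_pct - benchmark_pct, 1)


# Rank models from best to worst and attach the derived figures.
leaderboard = [
    (name, ending_balance(r), excess_return(r))
    for name, r in sorted(returns_pct.items(), key=lambda kv: kv[1], reverse=True)
]

# Models whose return exceeded the S&P 500's.
beat_benchmark = [name for name, _, excess in leaderboard if excess > 0]

for name, balance, excess in leaderboard:
    print(f"{name:<18} ${balance:>12,.2f}  {excess:+.1f} pts vs S&P 500")
```

Under these assumptions, Grok 4's 8.2% corresponds to roughly $108,200 and Qwen 3's 18.8% loss to roughly $81,200, and only the top four models show a positive margin over the index.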



Alex Dudov

Alex Dudov - writer with expertise in crypto, global markets, and the intersection of AI and blockchain innovation.