⬤ Recursive Self-Aggregation (RSA) paired with Gemini 3 Flash scored 59.31% on the latest ARC-AGI-2 public evaluation, placing it among the leading reasoning systems. The accompanying chart plots this score against cost per task, showing how the model approaches near-human performance without the hefty price tag of top-ranked alternatives.
⬤ The data show that RSA with Gemini 3 Flash outpaces several pricier options once cost efficiency is factored in. Models like Gemini 3 Deep Think and GPT-5 Pro sit at much higher cost points on the chart, and their scores do not appear to justify the premium. Gemini 3 Flash with RSA matches GPT-5.2 xHigh's performance at a dramatically lower cost per task, suggesting that strong reasoning does not require a premium budget.
⬤ The chart shows a clear performance ladder across the RSA Low, Medium, and High variants using Gemini 3 Flash. As costs rise slightly, scores climb steadily toward the average human benchmark. This scaling is achieved without the complex scaffolding or code-refinement pipelines that some higher-scoring systems depend on.
⬤ These results matter for the AI market because they highlight a shift toward cost-aware reasoning benchmarks. The ARC-AGI-2 evaluation increasingly rewards the balance between accuracy and economic efficiency. Recursive Self-Aggregation with Gemini 3 Flash fits this trend, delivering competitive reasoning performance at a cost that suits both research labs and production environments.
Eseandre Mordi