⬤ Recent findings from AI Digest and METR show AI models hitting a new gear when it comes to coding tasks. The story of 2025 isn't about slow, steady improvement; it's about acceleration. Leading systems like Opus 4.5, GPT-5.1 Codex-Max, GPT-5, and Grok 4 are now handling coding work that would take a human hours to complete, not just seconds or minutes. That's a clear jump from where things stood even a year ago.
⬤ The chart backing this up shows an obvious turning point. From 2024 to 2025, the doubling time for the length of tasks AI can handle dropped to four months, compared to seven months across the 2019–2025 period. Two trend lines tell the story: the old, gradual curve and a much steeper new one. Earlier models like GPT-2, GPT-3, GPT-3.5, and GPT-4 sit near the bottom of the scale, while newer systems are pushing toward 1.7 hours, 3.3 hours, and even 5 hours of autonomous task execution.
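To put the two doubling times in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes clean exponential growth at a fixed doubling time, which is a simplification of the chart's trend lines. The function names and the 40-hour target are illustrative choices, not figures from AI Digest or METR.

```python
import math


def annual_growth_factor(doubling_months: float) -> float:
    """How many times longer tasks get over 12 months at a fixed doubling time."""
    return 2 ** (12 / doubling_months)


def months_to_reach(start_hours: float, target_hours: float,
                    doubling_months: float) -> float:
    """Months needed to grow from start_hours to target_hours.

    Assumes the doubling time stays constant, which is an extrapolation.
    """
    return doubling_months * math.log2(target_hours / start_hours)


# Doubling times quoted in the article: 7 months (2019-2025), 4 months (2024-2025).
for label, months in [("2019-2025 trend", 7), ("2024-2025 trend", 4)]:
    print(f"{label}: x{annual_growth_factor(months):.1f} per year")
# prints:
# 2019-2025 trend: x3.3 per year
# 2024-2025 trend: x8.0 per year

# Hypothetical target: going from 5-hour tasks to 40-hour tasks (three doublings).
print(months_to_reach(5, 40, 4))  # 12.0 months at the faster pace
print(months_to_reach(5, 40, 7))  # 21.0 months at the slower pace
```

In other words, a four-month doubling time means roughly an 8x jump per year, versus about 3.3x on the older trend. Whether the faster pace actually holds is exactly the open question.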
⬤ Just a year back, Gemini 1.5 was still the go-to reference, multimodal features were hit-or-miss, and agent-style systems like OpenAI's o1 were barely out of the lab. Fast forward to 2025: multimodal understanding got sharper, agent-like behavior started scaling up, and open-source models began closing the gap. Work that used to take engineers months now wraps up in hours or minutes—a real-world reflection of what the data is showing.
⬤ This matters because when AI capabilities jump this fast, the ripple effects are huge. We're talking shifts in productivity expectations, workflow automation, competitive pressure, and how quickly new tech gets adopted. If this pace holds into 2026, AI might stop just helping out and start running entire workflows on its own—forcing organizations to rethink how fast they can move, how efficient they can get, and what innovation looks like across AI-powered industries.
Saad Ullah