A recent comparison has sparked debate about what "intelligence" means for AI models. The test focused on a deceptively simple task: accurately reading and reconstructing a New York City MTA timetable from an image.
Haiku 4.5 vs GPT-5: The OCR Challenge
Jerry Liu ran an experiment in which both models performed optical character recognition on a dense train schedule. The results were striking. GPT-5 struggled with column spacing, merging times and flattening the layout until the data lost its structure. Haiku 4.5 recreated the timetable almost perfectly, preserving columns, rows, and spacing with minimal errors.
Liu's takeaway was direct: "Better reasoning doesn't correlate to visual understanding." Language reasoning and spatial perception are distinct skills—excelling at one doesn't guarantee competence in the other.
Why This Matters
While GPT-5 is the stronger model for abstract reasoning and problem-solving, Haiku 4.5 proved better suited in this test to visual parsing and structured-data recognition. In real-world documents such as financial reports or legal documents, that precision is critical: even a small OCR error, such as two merged columns, can distort every later step of an analysis pipeline. The result suggests that smaller, specialized models can outperform massive general-purpose systems on targeted tasks.
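To make the risk concrete, here is a minimal Python sketch of a structural check a pipeline might run on OCR output before trusting it. It is not taken from Liu's experiment: the function name `find_structure_errors`, the `H:MM` time pattern, and the sample rows are all assumptions made up for illustration.

```python
import re

# Assumed cell format: a single time such as "5:42" or "12:07".
TIME_CELL = re.compile(r"^\d{1,2}:\d{2}$")


def find_structure_errors(rows, expected_columns):
    """Return (row_index, message) pairs for rows whose layout looks damaged."""
    errors = []
    for i, row in enumerate(rows):
        # A merged or dropped column changes the number of cells in the row.
        if len(row) != expected_columns:
            errors.append((i, f"expected {expected_columns} cells, found {len(row)}"))
        # Two times fused into one cell no longer match the single-time pattern.
        for cell in row:
            if not TIME_CELL.match(cell):
                errors.append((i, f"cell {cell!r} is not a single time"))
    return errors


# Made-up sample data: in `merged`, two times in the second row were fused.
clean = [["5:42", "5:51", "6:03"], ["6:12", "6:21", "6:33"]]
merged = [["5:42", "5:51", "6:03"], ["6:126:21", "6:33"]]

clean_errors = find_structure_errors(clean, expected_columns=3)
merged_errors = find_structure_errors(merged, expected_columns=3)
```

A check like this only catches layout damage. It cannot tell whether a correctly shaped cell holds the wrong time, so it complements an accurate parser and does not replace one.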
A Lightweight Contender in Document AI
Haiku 4.5 is available through LlamaCloud, the platform from LlamaIndex, the company Jerry Liu founded, where it can be integrated with little effort. Its balance of accuracy and efficiency makes it a good fit for companies seeking scalable document tools without GPT-5's heavy resource demands.
As Liu noted, Haiku is a "lightweight contender for document parsing" that combines precision with practicality.
Beyond Reasoning: The Next AI Frontier
This comparison highlights a crucial shift in AI: moving from pure reasoning toward multimodal understanding, where models must interpret text, spatial relationships, images, and layouts. While GPT-5 leads in analytical reasoning, Haiku 4.5 shows that specialization—not scale—may drive the next wave of AI innovation.
Peter Smith