● In a post by Niels Rogge, the newly launched Datalab Chandra v1.0 was shown leading OCR (Optical Character Recognition) benchmarks with a score of 83.1, the highest among the models compared. The results show Chandra ahead of major AI models, including GPT-4o (69.9), Gemini Flash 2 (63.8), and DeepSeek OCR (75.4).
● Rogge called it "probably the best model with the least hype," pointing out it offers both a serverless API and fully open-source versions on Hugging Face. Despite its technical lead, Datalab keeps a low profile — the team reportedly has just 85 followers. This gap between performance and publicity highlights how smaller research groups are now competing with well-funded AI giants.
● The release echoes comments from Tibo, co-founder of Datalab's Codex platform, who emphasized the company's focus on transparency. "We promised unprecedented transparency for Codex and to take reports of degradation seriously," he said, noting the team's active response to feedback during "incredible growth week over week." This openness contrasts sharply with the closed practices of larger AI companies.
● The impact could be substantial. Open, high-performing OCR models like Chandra v1.0 may shake up the commercial API market by cutting costs and lowering barriers for startups. Datalab's results suggest that technical quality and open collaboration can matter more than marketing budgets.
● With Chandra's top ranking, Datalab has raised the bar for both performance and openness, suggesting that AI innovation doesn't always come from the biggest names and can come from those most committed to transparency.
Usman Salis