Two AI Models Built Working CLI Tools in Under 15 Minutes—No Human Help Required

GLM 4.7 and MiniMax M2.1 autonomously created command-line task runners in 10-14 minutes, handling complex coding tasks that typically take senior developers 1-2 days to complete.

⬤ Two open-weight AI models just showed they can build production-ready software completely on their own. In a recent test by Kilo Code, GLM 4.7 and MiniMax M2.1 were challenged to create CLI task runners from scratch: tools that handle YAML parsing, topological sorting with cycle detection, process management, and file hashing. Both models finished the job in 10-14 minutes.
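To give a sense of what "topological sorting with cycle detection" involves, here is a minimal Python sketch using Kahn's algorithm. It is not code from either model's output. The `topo_order` function, the `CycleError` class, and the shape of the `deps` mapping are all hypothetical names chosen for illustration, assuming the task file has already been parsed into a dictionary of task names and their dependencies.

```python
from collections import deque


class CycleError(ValueError):
    """Raised when task dependencies cannot be ordered because of a cycle."""


def topo_order(deps: dict[str, list[str]]) -> list[str]:
    """Return task names so that every task comes after its dependencies.

    `deps` maps a task name to the names it depends on. This shape is an
    assumption made for the example, not the format used in the test.
    """
    # Collect every task, including ones only mentioned as a dependency.
    tasks = set(deps)
    for needed in deps.values():
        tasks.update(needed)

    # remaining[t]: unmet dependencies of t; dependents[d]: tasks waiting on d.
    remaining = {t: 0 for t in tasks}
    dependents = {t: [] for t in tasks}
    for task, needed in deps.items():
        for dep in set(needed):  # ignore duplicate entries
            remaining[task] += 1
            dependents[dep].append(task)

    # Start with tasks that have no dependencies (sorted for stable output).
    ready = deque(sorted(t for t in tasks if remaining[t] == 0))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for waiting in sorted(dependents[current]):
            remaining[waiting] -= 1
            if remaining[waiting] == 0:
                ready.append(waiting)

    # Anything never released is in a cycle or depends on one.
    if len(order) != len(tasks):
        stuck = sorted(t for t in tasks if remaining[t] > 0)
        raise CycleError("tasks blocked by a dependency cycle: " + ", ".join(stuck))
    return order
```

Given `{"build": ["compile"], "compile": ["fetch"], "test": ["build"], "fetch": []}`, this returns `["fetch", "compile", "build", "test"]`. A mapping such as `{"a": ["b"], "b": ["a"]}` raises `CycleError` instead of looping forever, which is the failure a task runner needs to report before it starts any process.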

⬤ GLM 4.7 went all-in with a 741-line architecture plan, ultimately generating 1,850 lines of code across 18 files. Its output included a thorough 363-line README and complete documentation, and the run cost $0.30. MiniMax M2.1 took a leaner approach with a 284-line plan and 9 files in a flat structure. It skipped the README, but it impressed by catching and fixing its own parsing bug during testing, all for just $0.15.

"The models showcased their ability to autonomously plan, code, debug, and test—skills that once required significant human expertise."

⬤ Here's the kicker: despite their different approaches, both models met all 20 requirements and produced functionally identical results. GLM 4.7 delivered more modular, well-documented code, while MiniMax M2.1 focused on efficiency and cost savings. Either way, AI systems that plan, build, and debug complex software without human intervention mark a real shift in what's possible.

MiniMax M2.1 vs GLM-4.7: 74.0 vs 73.8 on SWE-bench

First benchmark results put MiniMax M2.1 and GLM-4.7 neck-and-neck on SWE-bench Verified, though MiniMax pulls ahead on Terminal Bench 2.0. Keep in mind these are vendor-reported numbers, not independent tests.

⬤ This comparison shows how AI development is becoming more accessible. With tools like Kilo Code's Parallel Mode, developers can now run multiple AI models side-by-side to find the sweet spot between performance and budget. What used to require days of senior developer time can now be tested and deployed in minutes—opening new possibilities for businesses and developers alike.


Usman Salis
