Huawei Researchers Introduce CLI-Gym: 1,655-Task AI Training Framework

Researchers from Huawei and partner institutes introduced CLI-Gym, a framework designed to automatically generate command-line troubleshooting tasks for AI agents. The system created a dataset of 1,655 tasks and helped the LiberCoder model reach 46.1% on Terminal-Bench.

Researchers from Huawei Technologies, the Beijing Institute of Technology, and the Chinese Academy of Sciences have rolled out CLI-Gym, a training framework that teaches AI agents to navigate command-line environments. Unlike traditional approaches that depend on manually labeled datasets, this system creates realistic troubleshooting scenarios by starting with healthy software environments and deliberately breaking them.

How Environment Inversion Creates 1,655 Training Tasks

The core innovation is what the researchers call "agentic environment inversion." The system explores working environments, then inverts them into faulty states that simulate real-world problems. Each degraded state comes packaged with the relevant commands, configurations, and error messages, producing a ready-to-use training challenge. This approach generated 1,655 environment-intensive tasks, making it the largest dataset focused on command-line interaction and system troubleshooting.
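The inversion idea can be illustrated with a toy sketch. This is not CLI-Gym's code: the in-memory "environment" (a dictionary of file paths and contents), the `health_check` rules, and the names `Task`, `drop_line`, and `invert` are all hypothetical, chosen only to show the pattern of starting from a healthy state, breaking it on purpose, and recording the resulting errors as a task.

```python
import copy
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a real environment: file path -> file contents.
Env = Dict[str, str]


def health_check(env: Env) -> List[str]:
    """Return error messages; an empty list means the environment is healthy."""
    errors = []
    if "PORT=" not in env.get("/app/.env", ""):
        errors.append("config error: PORT is not set in /app/.env")
    if "requests" not in env.get("/app/requirements.txt", ""):
        errors.append("ModuleNotFoundError: No module named 'requests'")
    return errors


@dataclass
class Task:
    broken_env: Env                # the degraded state handed to the agent
    error_messages: List[str]      # symptoms the agent sees
    verify: Callable[[Env], bool]  # True once the environment is repaired


def drop_line(path: str, needle: str) -> Callable[[Env], Env]:
    """Build a fault that deletes every line containing `needle` from `path`."""
    def fault(env: Env) -> Env:
        broken = copy.deepcopy(env)  # never mutate the healthy original
        kept = [ln for ln in broken[path].splitlines() if needle not in ln]
        broken[path] = "\n".join(kept)
        return broken
    return fault


def invert(healthy: Env, fault: Callable[[Env], Env]) -> Task:
    """Turn a healthy environment into a troubleshooting task."""
    if health_check(healthy):
        raise ValueError("inversion must start from a working environment")
    broken = fault(healthy)
    errors = health_check(broken)
    if not errors:
        raise ValueError("fault did not produce an observable failure")
    return Task(broken, errors, verify=lambda env: not health_check(env))


healthy_env = {
    "/app/.env": "HOST=0.0.0.0\nPORT=8080",
    "/app/requirements.txt": "flask\nrequests",
}
task = invert(healthy_env, drop_line("/app/.env", "PORT="))
print(task.error_messages)
```

In this sketch the fault is a fixed rule; in the framework described above, an agent explores the environment and produces the degraded state, so the faults would presumably be far more varied than a deleted configuration line.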

LiberCoder Achieves 46.1% Pass Rate on Terminal-Bench

The dataset was used to train a model family called LiberCoder, which includes versions with 32 billion and 235 billion parameters. Results show the model reached 46.1% on Terminal-Bench 1.0, beating several baseline models designed for coding and system tasks. The evaluation compared LiberCoder against DeepSeek-V3, Qwen-Coder variants, and other large language models, with LiberCoder showing strong results when trained on CLI-Gym task trajectories.

CLI-Gym reflects a broader shift in AI research toward scalable training environments. As agents move beyond text generation into actual system operations, tools that generate large volumes of realistic technical tasks become essential for improving reliability and debugging capabilities. The development fits within Huawei's ongoing push across AI hardware and software, which has recently included announcements about the Ascend 950 AI chip and performance improvements in the Kirin 9030 processor.

Saad Ullah

Saad Ullah - engineer and writer passionate about AI, blockchain, and the disruptive technologies driving fintech innovation.