● According to Robert Youssef, DeepAgent is "the biggest leap in AI agents since ReAct." The newly released paper from Renmin University of China and Xiaohongshu Inc., titled "DeepAgent: A General Reasoning Agent with Scalable Toolsets," introduces a model that can reason autonomously, discover tools, and take action—all without preset workflows or fixed tool lists.
● DeepAgent uses a technique called Memory Folding, which lets the model "compress" past thoughts into structured episodic, working, and tool memories. It's similar to how our brains organize and recall experiences, helping the agent stay contextually aware across complex, multi-step tasks. The flip side? More autonomy means more unpredictability. In dynamic environments, that can make oversight, safety checks, and debugging much harder at scale.
● The team also developed ToolPO, a reinforcement learning method that rewards agents not just for finishing tasks but also for how efficiently they use tools along the way. Unlike older frameworks such as ReAct and AutoGPT, which rely on rigid step-by-step interactions, this approach could cut computational costs while boosting productivity in AI systems that handle complex reasoning, workflow automation, or digital operations.
● DeepAgent beat GPT-4-class reasoning agents on nearly every tested environment—WebShop, ALFWorld, GAIA, and HLE—even when working with tools it had never seen before. This is a big step toward general-purpose reasoning AI that can remember, adapt, and evolve dynamically—traits we've long considered uniquely human.
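To make the Memory Folding idea above more concrete, here is a minimal sketch of how a long interaction history might be compressed into episodic, working, and tool memories. It is an illustration only, not the paper's implementation: the `FoldedMemory` class, the `fold` function, the `keep_recent` parameter, and the shape of the history entries are all hypothetical, and a real agent would presumably use the language model itself to write the summaries rather than the simple rules used here.

```python
from dataclasses import dataclass, field


@dataclass
class FoldedMemory:
    """Hypothetical container for the three memory types named in the article."""
    episodic: list[str] = field(default_factory=list)   # key events so far
    working: list[str] = field(default_factory=list)    # most recent thoughts
    tool: dict[str, int] = field(default_factory=dict)  # tool name -> call count


def fold(history: list[dict], keep_recent: int = 2) -> FoldedMemory:
    """Compress a raw history into structured memory (rule-based stand-in).

    Assumed entry shape (not from the paper):
    {"kind": "thought" or "tool_call", "text": str, "tool": str or None}
    """
    memory = FoldedMemory()
    thoughts = []
    for step in history:
        if step["kind"] == "tool_call":
            name = step["tool"]
            # Tool memory: count how often each tool was used.
            memory.tool[name] = memory.tool.get(name, 0) + 1
            # Episodic memory: keep a one-line record of the event.
            memory.episodic.append(f"called {name}")
        elif step["kind"] == "thought":
            thoughts.append(step["text"])
    # Working memory: only the last few thoughts survive the fold.
    memory.working = thoughts[-keep_recent:] if keep_recent > 0 else []
    return memory


history = [
    {"kind": "thought", "text": "Need the product price", "tool": None},
    {"kind": "tool_call", "text": "search('laptop')", "tool": "search"},
    {"kind": "thought", "text": "Found three candidates", "tool": None},
    {"kind": "tool_call", "text": "search('laptop 16GB')", "tool": "search"},
    {"kind": "thought", "text": "Compare the top two", "tool": None},
]
memory = fold(history)
```

In this toy version, five history entries collapse into a short record: two episodic events, the two most recent thoughts, and a single tool counter. The point of the design is that the agent carries the folded record forward instead of the full transcript, which keeps the context small across long, multi-step tasks.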
Alex Dudov