Google Research has unveiled a fresh take on how AI models handle memory. The team introduced a combination of the Titans architecture and MIRAS that enables "test-time memorization." This setup lets models update their long-term memory while they're actually running, using a surprise-based mechanism to decide what's worth remembering and what can be forgotten.
What makes this different is timing. Unlike traditional training, where models learn once and then stay fixed, this approach works during inference, when the model is actively responding to you. Most large language models hit a wall with context windows, effectively forgetting earlier parts of a conversation once they run out of space. Titans plus MIRAS changes that dynamic by letting models decide in real time which information deserves to stick around.
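To make the idea concrete, here is a minimal sketch of what surprise-based test-time memory updates could look like. Everything below (the TestTimeMemory class, the linear associative memory, the MSE surprise signal, and the hyperparameters) is an illustrative assumption, not Google's published implementation.

```python
# Minimal sketch of surprise-based test-time memorization, loosely inspired
# by the idea of updating a neural long-term memory during inference.
# All names and hyperparameters here are hypothetical illustrations.
import torch

class TestTimeMemory(torch.nn.Module):
    def __init__(self, dim: int, lr: float = 0.1, decay: float = 0.05):
        super().__init__()
        # A simple linear associative memory: maps keys to values.
        self.memory = torch.nn.Linear(dim, dim, bias=False)
        self.lr = lr        # how strongly surprising inputs rewrite memory
        self.decay = decay  # forgetting rate: old associations fade out

    def update(self, key: torch.Tensor, value: torch.Tensor) -> float:
        # "Surprise" = how badly the current memory predicts this value.
        pred = self.memory(key)
        loss = torch.nn.functional.mse_loss(pred, value)
        # Gradient of the surprise signal with respect to memory weights.
        grad, = torch.autograd.grad(loss, self.memory.weight)
        with torch.no_grad():
            # Forget a little, then write in the surprising association.
            self.memory.weight.mul_(1.0 - self.decay)
            self.memory.weight.sub_(self.lr * grad)
        return loss.item()

    def recall(self, key: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            return self.memory(key)

# Usage: during "inference", stream (key, value) pairs through the memory.
# Surprising pairs (high loss) cause large updates; familiar ones barely
# change the weights.
mem = TestTimeMemory(dim=16)
for step in range(3):
    k, v = torch.randn(16), torch.randn(16)
    surprise = mem.update(k, v)
    print(f"step {step}: surprise={surprise:.3f}")
```

The key design choice in this sketch is that the prediction error doubles as the write signal: inputs the memory already handles well leave it nearly untouched, while surprising inputs trigger strong updates, mirroring the surprise-based gating the research describes.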
If this scales up, we might stop obsessing over context length entirely. Instead of building bigger and bigger context windows, models could just update their internal memory on the fly. That said, Google hasn't shared performance numbers, deployment plans, or real-world applications yet. This is still firmly in research territory, not something you'll see in production tomorrow.
There's also a trust question lurking here. When a model can rewrite its own long-term memory mid-conversation, things get complicated. How do you track what it remembers? How do you know it won't develop unexpected behaviors? The research doesn't make promises about outcomes, but it does raise important questions about transparency and control. As this technology develops, the industry will need to figure out how to balance adaptive memory with reliability and predictability.
Saad Ullah