DeepSeek May Have Found a New Way to Improve AI’s Ability to Remember

DeepSeek May Have Found a New Way to Improve AI’s Ability to Remember

MIT Technology Review
MIT Technology ReviewOct 29, 2025

Why It Matters

By reshaping how AI models remember, DeepSeek’s visual-token method could dramatically improve efficiency and scalability of large language models, reducing costs and environmental impact while opening new avenues for data generation and advanced reasoning.

Summary

DeepSeek, a Chinese AI firm, unveiled an OCR model that uses visual tokens instead of traditional text tokens to store and retrieve information, effectively compressing memory like a snapshot of a book page. The approach reduces token count, mitigating the “context rot” problem in long AI conversations and lowering computational load, which could cut AI’s carbon footprint. Early benchmarks show the model matches top OCR systems, and its tiered compression mimics human memory fading, allowing older content to remain accessible in a blurred form. The technique also promises to generate massive training data, reportedly over 200,000 pages per day on a single GPU, addressing the industry’s text data shortage.

DeepSeek may have found a new way to improve AI’s ability to remember

Comments

Want to join the conversation?

Loading comments...