Exabase Achieves Highest Reported Score on Leading AI Memory Benchmark Using a Smaller, Cheaper Model

HPCwire, May 26, 2026

Key Takeaways

  • M-1 scored 96.4% on LongMemEval, highest reported
  • Used Gemini 3 Flash, 4‑6× cheaper than Gemini 3 Pro
  • Outperformed Mem0, Honcho, HydraDB, Supermemory on benchmark
  • Designed for production; powers Fabric’s AI workspace for 300k users
  • Retrieval architecture built with Hyperplane Labs, mimics episodic memory

Pulse Analysis

Long‑term memory remains one of the toughest infrastructure challenges for conversational AI. Benchmarks like LongMemEval, which tests recall, reasoning, and knowledge updates across 500 questions, have become the de facto standard for measuring how well an agent retains context over time. Until now, top scores have required large, expensive models that are impractical for most enterprises, which has limited the commercial rollout of truly persistent AI assistants.

Exabase’s M-1 engine flips that paradigm by delivering state‑of‑the‑art performance with Gemini 3 Flash, a model that is four to six times cheaper and faster than the Gemini 3 Pro used by rivals. The system’s retrieval layer, engineered with Hyperplane Labs, treats memory as a reconstructive process, drawing on episodic memory theory to improve relevance without relying on brute‑force scaling. This design not only secured a 96.4% top‑50 retrieval accuracy but also proved production‑ready, already handling the memory needs of Fabric’s AI workspace for more than 300,000 users.
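To make the metric concrete, here is a minimal sketch of how a top‑k retrieval accuracy figure could be computed. It is not Exabase's or LongMemEval's evaluation code: the function name, the input shapes, and the definition of a hit (at least one relevant memory among the first k results) are assumptions for illustration. If the score is a plain fraction over all 500 questions, 96.4% corresponds to 482 questions answered correctly.

```python
from __future__ import annotations

from typing import Sequence


def top_k_accuracy(
    ranked_ids: Sequence[Sequence[str]],
    gold_ids: Sequence[set[str]],
    k: int = 50,
) -> float:
    """Fraction of questions with at least one relevant item in the top-k results.

    ranked_ids[i] is the retriever's ranked list of memory IDs for question i
    (hypothetical format); gold_ids[i] is the set of IDs judged relevant.
    """
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not ranked_ids:
        return 0.0
    # A question counts as a hit if any relevant ID appears among the first k results.
    hits = sum(
        1
        for ranked, gold in zip(ranked_ids, gold_ids)
        if gold.intersection(ranked[:k])
    )
    return hits / len(ranked_ids)


# Toy data: three questions, each with one relevant memory.
ranked = [["a", "b", "c"], ["d", "e"], ["f"]]
gold = [{"c"}, {"x"}, {"f"}]
print(top_k_accuracy(ranked, gold, k=2))  # only the third question is a hit
print(top_k_accuracy(ranked, gold, k=3))  # the first and third questions are hits

# Under the plain-fraction assumption, 96.4% of 500 questions:
print(round(0.964 * 500))
```

A larger k makes a hit easier to score, so a top‑50 figure is only comparable with other systems' results reported at the same k.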

The broader impact could be significant: developers can embed high‑fidelity memory into agents without prohibitive compute budgets, which lowers the entry threshold for sophisticated AI products. If more platforms adopt Exabase’s API, the competitive landscape may shift away from heavyweight, proprietary solutions toward modular, cost‑effective memory services. Over time, that could accelerate the adoption of AI agents in customer support, knowledge management, and personalized digital assistants.
