Google PM Open-Sources Always On Memory Agent, Ditching Vector Databases for LLM-Driven Persistent Memory
AI • CTO Pulse • Enterprise

VentureBeat • Mar 6, 2026

Why It Matters

The project demonstrates a scalable path to persistent AI memory at low token cost, but forces enterprises to confront compliance, drift, and auditability of autonomous agents.

Key Takeaways

  • Open‑source agent replaces the vector DB with LLM‑driven memory
  • Uses Gemini 3.1 Flash‑Lite at $0.25 per million input tokens
  • Persistent SQLite storage enables continuous ingestion and consolidation
  • Governance and drift become primary enterprise concerns
  • ADK supports multi‑deployment, making the memory layer reusable
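At the cited Flash‑Lite pricing, the running cost of a continuously consolidating agent is easy to bound. The back‑of‑envelope below uses illustrative usage figures that are assumptions, not numbers from the project:

```python
# Rough daily cost of a 24/7 memory agent at the article's Flash-Lite
# pricing. Token volumes per cycle are illustrative assumptions.
INPUT_PRICE_PER_M = 0.25    # $ per million input tokens (from the article)
OUTPUT_PRICE_PER_M = 1.50   # $ per million output tokens (from the article)

tokens_in_per_cycle = 50_000   # assumed: context + events re-read each pass
tokens_out_per_cycle = 2_000   # assumed: consolidated memory written back
cycles_per_day = 48            # one consolidation every 30 minutes

daily_cost = cycles_per_day * (
    tokens_in_per_cycle / 1e6 * INPUT_PRICE_PER_M
    + tokens_out_per_cycle / 1e6 * OUTPUT_PRICE_PER_M
)
print(f"${daily_cost:.2f}/day")  # → $0.74/day under these assumptions
```

Even with generous context sizes, the arithmetic stays well under a dollar a day per agent, which is the economic point the project is making.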

Pulse Analysis

Persistent memory has long been a stumbling block for production AI agents. Traditional stacks rely on embedding pipelines, vector databases, and complex indexing, inflating both operational overhead and latency. By moving the retrieval logic into the LLM itself, Google’s Always On Memory Agent sidesteps these layers, offering a leaner architecture that is especially attractive for startups and midsize teams seeking rapid prototyping without heavyweight infrastructure.

The open‑source reference implementation leverages Gemini 3.1 Flash‑Lite, a model priced at $0.25 per million input tokens and $1.50 per million output tokens. Flash‑Lite’s 2.5× speed advantage and strong benchmark scores make it economically viable for a 24/7 service that repeatedly reads, thinks, and writes to a SQLite‑backed memory store. The agent runs continuously, ingesting text, images, audio, video, and PDFs, then consolidates structured memories every 30 minutes. This design shifts performance bottlenecks from vector search to model latency and memory compaction logic, delivering predictable costs for high‑frequency workloads.
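The ingest‑then‑consolidate loop described above can be sketched in a few lines. The schema and function names below are illustrative assumptions, not the released code, and the LLM call is stubbed as a pluggable `summarize` callable standing in for Gemini:

```python
import sqlite3
import time


class MemoryStore:
    """Minimal sketch of an LLM-driven persistent memory layer on SQLite.

    No embeddings or vector index: raw events are appended as-is, and a
    periodic consolidation pass hands them to the model, which writes back
    a compact structured memory. Schema and names are hypothetical.
    """

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS raw_events ("
            "id INTEGER PRIMARY KEY, ts REAL, content TEXT)"
        )
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS memories ("
            "id INTEGER PRIMARY KEY, ts REAL, summary TEXT)"
        )

    def ingest(self, content: str) -> None:
        # Append-only capture of incoming content; no indexing at write time.
        self.db.execute(
            "INSERT INTO raw_events (ts, content) VALUES (?, ?)",
            (time.time(), content),
        )
        self.db.commit()

    def consolidate(self, summarize):
        # Run each cycle (the article says every 30 minutes): read all
        # unconsolidated events, let the LLM compact them, persist the
        # result, and clear the processed backlog.
        rows = self.db.execute(
            "SELECT id, content FROM raw_events ORDER BY id"
        ).fetchall()
        if not rows:
            return None
        summary = summarize([content for _, content in rows])  # LLM call
        self.db.execute(
            "INSERT INTO memories (ts, summary) VALUES (?, ?)",
            (time.time(), summary),
        )
        self.db.execute("DELETE FROM raw_events WHERE id <= ?", (rows[-1][0],))
        self.db.commit()
        return summary


store = MemoryStore()
store.ingest("user prefers dark mode")
store.ingest("meeting moved to 3pm")
result = store.consolidate(lambda events: " | ".join(events))
```

The point of the design is visible even in this toy version: retrieval quality depends entirely on the model's consolidation pass rather than on a separate embedding pipeline, which is where the latency and cost trade-offs discussed above come from.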

For enterprises, the real test lies beyond speed and price. Persistent agents raise questions about data governance, drift, and audit trails—issues highlighted by industry commentators who warn of “compliance nightmares” when memories evolve autonomously. The ADK framework’s support for multi‑deployment, tool‑calling, and evaluation hooks provides a foundation, yet organizations will need deterministic policies, retention guarantees, and robust monitoring to trust such systems in production. As AI moves toward long‑term, context‑aware assistants, Google’s memory agent offers a compelling blueprint, but its adoption will hinge on how effectively firms can impose governance without eroding the economic benefits.


Read Original Article