Build Hour: Agent Memory Patterns

OpenAI · Dec 4, 2025

Why It Matters

Mastering agent memory patterns enables businesses to deploy long‑running AI assistants that stay on context, reduce token costs, and avoid costly errors, directly impacting product reliability and scalability.

Summary

The Build Hour session, hosted by Michaela from OpenAI’s startup marketing team and featuring solution architects Emre and Brian, focused on “agent memory patterns” – a deep dive into context engineering for long‑running AI agents. The presenters framed context engineering as both an art, requiring judgment about what information matters, and a science, built on repeatable patterns that shape what the model sees. They positioned it as a broader discipline encompassing prompt engineering, retrieval, and memory management, essential for scaling AI‑driven products.

Key insights covered three core memory strategies: “reshape and fit” (trimming, compaction, and summarization to stay within token limits), “isolate and route” (offloading context to sub‑agents or selective handoffs), and “extract and retrieve” (building short‑term versus long‑term memory across sessions). Emre highlighted the finite token budget as the core bottleneck, illustrating how unchecked context can lead to bursts, conflicts, poisoning, and noise. Failure modes were visualized with concrete examples, such as a sudden 3,000‑token spike when a refund‑policy tool call flooded the prompt, and contradictory instructions causing the agent to issue an unintended refund.
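The “reshape and fit” strategy can be sketched in a few lines: keep the most recent turns verbatim, drop older turns oldest‑first once a token budget is exceeded, and stand in a compact summary for whatever was dropped. This is an illustrative sketch, not the session’s actual code; the word‑count token estimate and all function names are assumptions.

```python
def estimate_tokens(message: str) -> int:
    """Crude token estimate: roughly one token per word."""
    return len(message.split())

def reshape_and_fit(history: list[str], budget: int, keep_recent: int = 2) -> list[str]:
    """Trim a transcript to fit within `budget` tokens.

    Recent turns are kept verbatim; older turns are retained newest-first
    while they fit, and anything dropped is replaced by one summary stub.
    """
    recent = history[-keep_recent:]
    older = history[:-keep_recent]
    used = sum(estimate_tokens(m) for m in recent)
    kept_older: list[str] = []
    # Walk older turns newest-first, keeping only what still fits.
    for msg in reversed(older):
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break
        kept_older.insert(0, msg)
        used += cost
    dropped = len(older) - len(kept_older)
    # A real system would summarize the dropped turns with a model call;
    # here a placeholder stub marks where that summary would be injected.
    prefix = [f"[summary of {dropped} earlier turns]"] if dropped else []
    return prefix + kept_older + recent
```

In production, the placeholder stub would be produced by an actual summarization call, and token counts would come from a real tokenizer rather than word counts.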

The live demo showcased a dual‑agent troubleshooting app for laptop issues: without memory, the agent repeated questions, while the memory‑enabled version retained earlier details like the Wi‑Fi and overheating problems, delivering a more coherent experience. Emre emphasized that “the core bottleneck is that context is finite,” and demonstrated how strategic tool definition and selective injection of information can prevent context bursts. The session also introduced a taxonomy of agent “context profiles” – RAG‑heavy, tool‑heavy, and conversational concierge – each with distinct static and dynamic token components.
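The demo’s “extract and retrieve” behavior can be sketched as a small store of structured notes: after each turn, durable facts are extracted so a later session can recall them instead of re‑asking. The keyword‑based extraction and word‑overlap retrieval below are deliberate simplifications (a real system would use a model for both steps), and all names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class MemoryStore:
    """Per-user long-term notes that persist across sessions."""
    notes: dict[str, list[str]] = field(default_factory=dict)

    def extract(self, user_id: str, turn: str) -> None:
        """Store a turn as a note if it mentions a known issue keyword.

        Stand-in for model-driven extraction of durable facts.
        """
        keywords = ("wi-fi", "overheat", "battery", "refund")
        if any(k in turn.lower() for k in keywords):
            self.notes.setdefault(user_id, []).append(turn)

    def retrieve(self, user_id: str, query: str) -> list[str]:
        """Return notes sharing at least one word with the query."""
        words = set(query.lower().split())
        return [n for n in self.notes.get(user_id, [])
                if words & set(n.lower().split())]
```

A memory‑enabled agent would call `retrieve` before answering and inject only the matching notes into the prompt, which is how the demo’s second agent avoided re‑asking about the Wi‑Fi and overheating issues.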

The implications for developers are clear: effective context engineering is critical to building reliable, scalable agents that maintain continuity across interactions without exhausting token limits. By applying the outlined best practices (lean system prompts, canonical examples, minimal tool overlap, and disciplined memory extraction), teams can maximize signal‑to‑noise, reduce hallucinations, and deliver higher‑quality outcomes in production AI systems.
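The remaining strategy, “isolate and route,” can be sketched as a router that hands off to a specialist sub‑agent along with only the slice of context that sub‑agent needs, keeping bulky or irrelevant material out of the main transcript. The keyword routing and agent names below are illustrative assumptions, not the session’s implementation.

```python
def route(query: str) -> str:
    """Pick a specialist sub-agent for the query (keyword stand-in)."""
    q = query.lower()
    if "refund" in q or "charge" in q:
        return "billing_agent"
    if "wi-fi" in q or "overheat" in q:
        return "hardware_agent"
    return "general_agent"

def handoff(query: str, context: dict[str, str]) -> dict[str, object]:
    """Build the isolated payload for the chosen sub-agent.

    Only context entries tagged for that agent are forwarded, so each
    specialist sees a small, relevant prompt instead of the full history.
    """
    agent = route(query)
    relevant = {k: v for k, v in context.items() if k.startswith(agent)}
    return {"agent": agent, "query": query, "context": relevant}
```

Because the refund policy lives only in the billing agent’s slice, a hardware question never pays its token cost, which is exactly the kind of context burst the session warned about.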

Original Description

AI agents don’t just reason — they remember. In this Build Hour, we deep-dive into context engineering techniques that enable agents to maintain short-term and long-term memory, personalize interactions, and operate reliably across long-running workflows.
Emre Okcular (Solutions Architect) covers:
• Why memory matters: stability, personalization, and long-running agent workflows
• Short-term memory patterns: Sessions, context trimming, compaction, summarization
• Long-term memory patterns: state objects, structured notes, memory-as-a-tool
• Architectures: token-aware sessions, state injection strategies, guardrails, and memory triggers
• Live demo: building an end-to-end agent with dynamic short and long term memory
• Best practices: avoiding context poisoning, context bursts, context noise, and context conflicts
• Live Q&A
👉 OpenAI Agents Python SDK: https://openai.github.io/openai-agents-python/
👉 Context Summarization with Realtime Cookbook: https://cookbook.openai.com/examples/context_summarization_with_realtime_api
👉 Follow along with the code repo: https://github.com/openai/build-hours
👉 Sign up for upcoming live Build Hours: https://webinar.openai.com/buildhours/
00:00 Context Engineering
10:44 Context Lifecycle Demo
20:13 Context Engineering Techniques
26:49 Reshape + Fit Demo
39:16 Conclusion
42:45 Q&A
