In this episode, Aravind Suresh, head of OpenAI's real‑time infrastructure team, explains how the company built a highly reliable, scalable streaming backbone for products like ChatGPT using Kafka and Flink. He describes the challenges of scaling a streaming platform tenfold every six to seven months, the need to hide infrastructure complexity from internal users, and the design choices (such as proxy‑based multi‑cluster Kafka and relaxed ordering guarantees) that enabled high availability and simplicity. The conversation highlights the importance of building simple, well‑abstracted systems that can evolve with rapid growth, and how OpenAI's approach balances velocity with stability for mission‑critical AI workflows.

In this episode, Tim Berglund talks with Gunnar Morling, a principal technologist at Confluent and a key contributor to projects like Hibernate and Debezium, about his "One Billion Row Challenge," a viral coding contest he launched for the Java community in January...

In this episode, Tim Berglund talks with Colt McNealy, founder and CEO of LittleHorse, about building a Kafka‑based platform for orchestrating microservice workflows and AI agents. Colt describes how his early experience debugging monolithic code with GDB contrasted with...

In this episode, Viktor Gamov interviews Jeremy Custenborder of Confluent about his journey from a paper boy to a leader in large‑scale systems, focusing on his experience keeping MySpace operational at massive pre‑cloud scale. Jeremy explains how he built custom...

In this episode, Tim Berglund chats with data infrastructure veteran Richie Artoul about his unconventional path—from running a LAN gaming café to building log storage at Datadog and now leading WarpStream at Confluent. Richie shares the technical and cultural challenges...

In this episode, Adi Polak interviews Bryan Oliver of Thoughtworks about his journey from building swimming pools to engineering massive GPU racks for AI workloads. Oliver explains the technical and operational challenges of running $3M GPU data centers, focusing on...

In this episode, Tim Berglund interviews Sophie Blee‑Goldman of Responsive about her journey from a Google internship to becoming a specialist in container orchestration and Kafka Streams. They dive into the technical challenge of scaling a Kafka Streams application for...

In this episode, Viktor Gamov interviews Dhiraj Suri of Confluent about his journey from a software developer at NetApp to a systems engineering leader focused on stream governance. Dhiraj explains how he tackled the challenge of integrating fragmented tools at...