Big Data Podcasts

Inside OpenAI’s Streaming Backbone with Aravind Suresh | Ep. 24
Podcast | Mar 23, 2026 | 30 min

In this episode, Aravind Suresh, head of OpenAI's real‑time infrastructure team, explains how the company built a highly reliable, scalable streaming backbone for products like ChatGPT using Kafka and Flink. He describes the challenges of scaling a streaming platform tenfold...

By Streaming Audio (Kafka / Confluent)
#352 AI Agents at Work: What Actually Breaks (and How to Fix It) with Danielle Crop, EVP Digital Strategy &...
Podcast | Mar 23, 2026 | 56 min

In this episode, Danielle Crop, EVP of Digital Strategy & Alliances at WNS, discusses the rapid rise of AI agents in enterprises, emphasizing the need to evaluate whether they deliver real value and operate securely. She advocates a balanced mindset...

By DataFramed
DuckDB, AI, and the Future of Data Engineering
Podcast | Mar 18, 2026 | 0 min

In this episode, Dan Beach chats with State Farm staff engineer Matt Martin about his journey from industrial engineering to data engineering, his deep involvement with DuckDB, and the evolving landscape of data platforms. Matt shares how early automation with...

By Data Engineering Central
The 1 Billion Row Challenge with Gunnar Morling | Ep. 23
Podcast | Mar 16, 2026 | 30 min

In this episode, Tim talks with Gunnar Morling, a principal technologist at Confluent and a key contributor to projects like Hibernate and Debezium, about his "One Billion Row Challenge"—a viral coding contest he launched for the Java community in January...

By Streaming Audio (Kafka / Confluent)
Re-Air: Data Tools, Templates, and the Trouble with “Easy” Solutions with the Cynical Data Guy
Podcast | Mar 11, 2026 | 41 min

In this re‑aired episode, hosts Eric Dodds and John Wessel chat with regular guest Matt, the Cynical Data Guy, about the rise of low‑code data tools like Clay and the evolving role of the “GTM engineer.” They debate whether such...

By The Data Stack Show
The Iceberg Ecosystem Today (W/ Anders Swanson)
Podcast | Mar 8, 2026 | 54 min

In this episode, Anders Swanson, a developer experience advocate at dbt Labs, walks through the current state of the Apache Iceberg ecosystem, covering how open‑source and cloud vendors are converging on shared standards, the rise of external catalog integrations, and...

By The Analytics Engineering Podcast
AEC’s Single Source of Truth: Reality or Pipe Dream?
Podcast | Mar 6, 2026 | 42 min

In this episode the hosts explore whether a true single source of truth (SSOT) for construction project data is achievable or merely aspirational. NuFORMA’s Dave Wagner and Carl Beillette argue that a single vendor solution is unrealistic; instead, the goal...

By AEC Business
🎥 MSCI's Luke Flemmer - "Bringing Clarity to Investment Decisions"
Podcast | Feb 26, 2026 | 0 min

In this episode, Luke Flemmer, head of private assets at MSCI, explains how standardizing and normalizing data can unlock transparency, price formation, and liquidity in private markets, drawing parallels to past evolutions in bonds, FX, and equities. He argues that...

By Alt Goes Mainstream
Killing Clusters & Orchestrating Chaos with Colt McNealy | Ep. 20
Podcast | Feb 23, 2026 | 38 min

In this episode Tim Berglund talks with Colt McNealy, founder and CEO of LittleHorse, about building a Kafka‑based platform for orchestrating microservice workflows and AI agents. Colt describes how his early experience debugging monolithic code with GDB contrasted with...

By Streaming Audio (Kafka / Confluent)
#347 Let's Get Physical with AI with Ivan Poupyrev, CEO at Archetype AI
Podcast | Feb 23, 2026 | 45 min

In this episode, Ivan Poupyrev, CEO of Archetype AI, explains that "physical AI" goes far beyond robotics, embedding foundation‑model intelligence into everyday devices—from washing machines to HVAC systems—and enabling them to communicate and optimize as a unified system. He outlines...

By DataFramed
Petra Durnin: You Don't Need More Tech — You Need Better Data
Podcast | Feb 18, 2026 | 48 min

In this episode, Petra Durnin, a veteran CRE researcher and tech‑to‑impact strategist, explains why the industry’s biggest hurdle isn’t more tools but cleaner, more integrated data. She walks through her career trajectory, from a temp analyst to leading data and...

By The Crexi Commercial Real Estate Podcast | CRE Insights & Strategies
Data Is the New Oil, and Your Database Is the Only Way to Extract It
Podcast | Feb 17, 2026 | 40 min

In this episode, Ryan interviews Shireesh Thota, Corporate Vice President of Azure Databases at Microsoft, about the rapid evolution of Microsoft's database offerings, including SQL Server, Cosmos DB, and Postgres, and how they fit into a unified Azure data platform....

By Stack Overflow Podcast
Driving Safer AVs Faster with Smart Simulation, Neural Reconstruction, and Data-Centric Tools - Ep. 289
Podcast | Feb 11, 2026 | 45 min

In this episode, Rohan Bhasin of Foretellix and Dan Gural of Voxel51 discuss how autonomous‑vehicle (AV) teams can transform massive drive‑log datasets into high‑fidelity simulations using neural reconstruction, scenario‑driven data curation, and NVIDIA‑accelerated pipelines. They explain how these tools enable...

By The AI Podcast (NVIDIA)
Re-Air: Data Teams at the Crossroads: Proving Value in a Changing Business Landscape with Ben Rogojan
Podcast | Feb 11, 2026 | 52 min

In this re‑aired episode, John interviews Ben Rogojan, owner of Seattle Data Guy, about how data teams can demonstrate value amid tighter budgets and rapid AI advances. They discuss shifting from output‑focused metrics like dashboards to outcome‑driven results, the importance...

By The Data Stack Show
Fail Fast & Ship It with Jeremy Custenborder | Ep. 18
Podcast | Feb 9, 2026 | 27 min

In this episode, Viktor Gamov interviews Jeremy Custenborder of Confluent about his journey from a paper boy to a leader in large‑scale systems, focusing on his experience keeping MySpace operational at massive pre‑cloud scale. Jeremy explains how he built custom...

By Streaming Audio (Kafka / Confluent)
#345 How to Drive Innovation with Brian Solis, Head of Global Innovation at ServiceNow
Podcast | Feb 9, 2026 | 1h 7m

In episode #345, DataFramed hosts Adel Nehme and Richie Cotton sit down with Brian Solis, Head of Global Innovation at ServiceNow, to explore how organizations can foster a culture of continuous innovation. Solis emphasizes the importance of aligning innovation with...

By DataFramed
Airbnb’s Open-Source GraphQL Framework with Adam Miskiewicz
Podcast | Feb 5, 2026 | 55 min

In this episode, Adam Miskiewicz, Principal Software Engineer at Airbnb, explains how the company built Viaduct, an open‑source data‑oriented service mesh and GraphQL platform that unifies a central schema across millions of microservices. He details the architectural principles—centralized schema, consistent...

By Software Engineering Daily – Data
#290: Always Be Learning
Podcast | Feb 3, 2026 | 1h 6m

In this episode, Tim Wilson, Val Kroll, and Spotify product manager/data scientist Mårten Schultzberg discuss the limits of focusing solely on win rates in experimentation and introduce a broader "learning rate" metric that captures wins, regressions (avoiding bad outcomes), and neutral...

By Digital Analytics Power Hour
From “This May Never Work” To WarpStream with Richie Artoul | Ep. 17
Podcast | Feb 2, 2026 | 30 min

In this episode, Tim Berglund chats with data infrastructure veteran Richie Artoul about his unconventional path—from running a LAN gaming café to building log storage at Datadog and now leading WarpStream at Confluent. Richie shares the technical and cultural challenges...

By Streaming Audio (Kafka / Confluent)
It's Friday, Juan and Tim Rant with Data Day Texas Takeaways
Podcast | Jan 30, 2026 | 34 min

In this 34‑minute episode, Juan and Tim unwind over a beer to discuss recent developments in the data landscape and share their key takeaways from Data Day Texas. They cover topics such as the hype around AI versus real monetary...

By Catalog & Cocktails
Inside $3M GPU Racks: Powering Modern AI with Bryan Oliver | Ep. 16
Podcast | Jan 26, 2026 | 30 min

In this episode, Adi Polak interviews Bryan Oliver of Thoughtworks about his journey from building swimming pools to engineering massive GPU racks for AI workloads. Oliver explains the technical and operational challenges of running $3M GPU data centers, focusing on...

By Streaming Audio (Kafka / Confluent)
From Evidence to Adoption: How datosX Is Redefining Digital Health Validation
Podcast | Jan 20, 2026 | 36 min

In this episode, Unity Stoakes interviews Robin Roberts, CEO of datosX Digital Health Labs, about transforming digital health validation from a bottleneck into a catalyst for adoption. Roberts explains how datosX leverages tier‑1 health system partnerships to run regulatory‑grade validation...

By StartUp Health NOW
#289: The Imperative of Developing Business Acumen
Podcast | Jan 20, 2026 | 1h 10m

In episode #289 the hosts discuss the essential role of business acumen for data and analytics professionals, defining it as both a grasp of general business fundamentals (finance, marketing, P&L) and deep knowledge of one’s own organization and industry context....

By Digital Analytics Power Hour
Agent Psychosis: Are We Going Insane?
Podcast | Jan 19, 2026 | 6 min

In this episode, Armin Ronacher warns that AI agent psychosis could be making us collectively uneasy, while Dan Abramov breaks down the AT Protocol as a social filesystem for decentralized apps. RepoBar is highlighted as a tool that surfaces your...

By Practical AI
Hacking Kafka Streams with Sophie Blee‑Goldman | Ep. 15
Podcast | Jan 19, 2026 | 34 min

In this episode, Tim Berglund interviews Sophie Blee‑Goldman of Responsive about her journey from a Google internship to becoming a specialist in container orchestration and Kafka Streams. They dive into the technical challenge of scaling a Kafka Streams application for...

By Streaming Audio (Kafka / Confluent)
Teaching AI How to Forget
Podcast | Jan 15, 2026 | 43 min

In this episode Ben Lorica interviews Ben Luria, CEO and co‑founder of Hirundo, about the rising importance of machine unlearning for enterprise AI systems. They explore how organizations can remove or forget specific data points from trained models to comply...

By The Data Exchange
America Under Surveillance with Michael Soyfer
Podcast | Jan 15, 2026 | 52 min

In this episode, Kevin Ball talks with Institute for Justice attorney Michael Soyfer about the rapid expansion of surveillance technologies such as automated license‑plate readers, facial‑recognition cameras, and predictive policing tools across U.S. municipalities. Soyfer explains the Fourth Amendment challenges...

By Software Engineering Daily – Data
#266 The CFO’s Secret Weapon Behind Higher Business Valuations: The Data Cube with David Whitcombe, Founder and Managing Director, Data...
Podcast | Jan 13, 2026 | 25 min

In this episode, Kevin Appleby and data‑analytics expert David Whitcombe explain how a "data cube"—a unified, governed layer that pulls together ERP, CRM, and operational data—gives CFOs a single source of truth that drives higher valuations in private‑equity exits. By...

By GrowCFO Show
#534: Diskcache: Your Secret Python Perf Weapon
Podcast | Jan 13, 2026 | 1h 14m

In this episode Michael Kennedy talks with Vincent Warmerdam about DiskCache, a SQLite‑backed, dictionary‑like cache that persists to disk and works safely across threads and processes. They explain how DiskCache’s @cache.memoize decorator and FanoutCache sharding enable cheap, high‑performance caching for...

By Talk Python to Me
Turning Chaos Into Push-Button Provisioning with Dhiraj Suri | Ep. 14
Podcast | Jan 12, 2026 | 21 min

In this episode, Viktor Gamov interviews Dhiraj Suri of Confluent about his journey from a software developer at NetApp to a systems engineering leader focused on stream governance. Dhiraj explains how he tackled the challenge of integrating fragmented tools at...

By Streaming Audio (Kafka / Confluent)