Lowering the Cost of Intelligence With NVIDIA's Ian Buck - Ep. 284

The AI Podcast (NVIDIA)

Dec 29, 2025

AI Summary

In this episode, Ian Buck of NVIDIA explains how mixture‑of‑experts (MoE) architectures allow AI models to become more capable without a linear rise in compute costs, using analogies that make the concept accessible. He highlights the hidden complexities of MoE, such as routing and load balancing, and stresses that achieving their promise requires extreme co‑design across hardware, networking, and software stacks. Buck also shares real‑world examples where MoE has delivered cost‑effective performance gains, illustrating the strategic importance of integrated system design for next‑generation AI.
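To make the routing and load-balancing idea concrete: in an MoE layer, a small gating function scores every expert for each token, but only the top few experts actually run, which is how the model grows in capacity without a proportional rise in compute. The sketch below is a minimal, illustrative top-k router in plain Python (not code from the episode or from NVIDIA); the expert functions, gate weights, and `moe_forward` name are all hypothetical stand-ins for the learned components of a real model.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(token, experts, gate_weights, top_k=2):
    """Route one token through only the top_k highest-scoring experts.

    experts:      list of callables standing in for small expert networks
    gate_weights: one weight vector per expert; the gate score is a
                  dot product of the token with that vector
    """
    # 1. Gating: score every expert for this token (cheap).
    scores = [sum(t * w for t, w in zip(token, wv)) for wv in gate_weights]
    # 2. Keep only the top_k experts; the rest stay idle (sparse compute).
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    # 3. Renormalize the selected scores into mixing weights.
    mix = softmax([scores[i] for i in top])
    # 4. Run and combine only the selected experts' outputs.
    out = [0.0] * len(token)
    for w, i in zip(mix, top):
        expert_out = experts[i](token)
        out = [o + w * e for o, e in zip(out, expert_out)]
    return out, top
```

The hidden complexity Buck points to shows up even here: if the gate keeps picking the same experts, those experts become hot spots, so production systems add load-balancing losses and careful expert placement across GPUs and the network, which is where the hardware/software co-design comes in.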

Episode Description

Discover how mixture‑of‑experts (MoE) architecture is enabling smarter AI models without a proportional increase in the required compute and cost. Using vivid analogies and real-world examples, NVIDIA’s Ian Buck breaks down MoE models, their hidden complexities, and why extreme co-design across compute, networking, and software is essential to realizing their full potential. Learn more: https://blogs.nvidia.com/blog/mixture-of-experts-frontier-models/
