OpenAI Sora 2 Team: How Generative Video Will Unlock Creativity and World Models

Sequoia Capital
Nov 5, 2025

Summary

OpenAI’s Sora team unveiled Sora 2, a next‑generation generative video model that uses diffusion transformers and space‑time tokens to simulate entire video sequences with physics‑consistent behavior. By treating video generation as world simulation, Sora 2 can maintain object permanence and produce realistic motion, avoiding the over‑optimistic errors of earlier models, which would bend physics to satisfy the prompt (for example, a missed basketball shot snapping into the hoop rather than rebounding off the backboard). The team emphasized an iterative rollout strategy to let society adapt to powerful simulation tech, and highlighted the model’s broad data mix—from real footage to anime—to build robust internal world models. They also speculated that future, more capable simulators could eventually replace physical labs for scientific experimentation.

Original Description

The OpenAI Sora 2 team (Bill Peebles, Thomas Dimson, Rohan Sahai) discuss how they compressed filmmaking from months to days, enabling anyone to create compelling video. Bill, who invented the diffusion transformer that powers Sora and most video generation models, explains how space-time tokens enable object permanence and physics understanding in AI-generated video, and why Sora 2 represents a leap for video. Thomas and Rohan share how they're intentionally designing the Sora product against mindless scrolling, optimizing for creative inspiration, and building the infrastructure for IP holders to participate in a new creator economy. The conversation goes beyond video generation into the team’s vision for world simulators that could one day run scientific experiments, their perspective on co-evolving society alongside technology, and how digital simulations in alternate realities may become the future of knowledge work.
Hosted by: Konstantine Buhler and Sonya Huang, Sequoia Capital
