Stanford CS153 Frontier Systems | Mati Staniszewski From ElevenLabs on The Future of Voice Systems

Stanford Online
May 4, 2026

Why It Matters

ElevenLabs demonstrates how community‑driven AI voice technology can turn a research frontier into a scalable product, unlocking new content creation, localization, and monetization opportunities for businesses worldwide.

Key Takeaways

  • ElevenLabs built its voice AI by listening to its Discord creator community.
  • Initial focus: fix AI dubbing and natural text‑to‑speech generation.
  • Leveraged open‑source models like Tortoise, improving speed and stability.
  • Launched a voice marketplace enabling users to contribute and monetize voices.
  • Product‑led growth strategy targets creators, developers, and audiobook markets.

Summary

In a Stanford CS153 Frontier Systems session, ElevenLabs CEO Mati Staniszewski outlined the company’s mission to reshape voice AI, tracing its origins from a Discord text‑to‑speech bot to a full‑stack platform for creators. He emphasized the team's early obsession with fixing AI dubbing (preserving speaker identity, emotion, and intonation across languages) and explained how that problem guided their research roadmap.

Staniszewski described a product‑led growth (PLG) approach that kept the development loop tight with Discord developers and other early adopters. By exposing a voice marketplace where users upload and monetize their own vocal profiles, ElevenLabs gathered real‑world data to refine transcription, translation, and generative speech models. The team prioritized the last‑mile text‑to‑speech challenge, leveraging open‑source breakthroughs like the Tortoise model to improve naturalness, speed, and stability.

A memorable anecdote highlighted the Polish dubbing issue: a single monotone voice narrates every character, underscoring the demand for nuanced, multi‑character audio. Staniszewski also quoted an early outreach question, "If dubbing was possible automatically, would you be interested?", which revealed broader creator needs such as voice‑over corrections and script‑level voice replacement, shaping ElevenLabs’ product focus.

The discussion signals a shift from research prototypes to commercial voice tools that can automate localization, audiobook production, and dynamic content generation. As ElevenLabs scales its marketplace and API, businesses across media, education, and gaming stand to benefit from cheaper, high‑quality voice synthesis, intensifying competition in the emerging AI audio economy.

Original Description

For more information about Stanford's online Artificial Intelligence programs, visit: https://stanford.io/ai
To follow along with the course schedule and syllabus, visit: https://cs153.stanford.edu/
In week two of CS153 ("AI Coachella"), Anjney Midha interviews Mati Staniszewski, founder and CEO of ElevenLabs, tracing the company’s origins from an early Discord text-to-speech bot to a fast-growing frontier audio and speech platform.
Mati explains ElevenLabs’ initial focus on solving AI dubbing inspired by Poland’s single-voice film narration, the shift to prioritizing emotional, natural-sounding text-to-speech for creators, and the evolution from cascaded pipelines (transcription, translation/LLM, and speech generation) toward real-time voice agents.
They discuss the tradeoffs between cascaded and fused multimodal systems, efforts to detect and convey emotion, safety and voice authentication limits, on-device model deployment, and collaboration with teams like Sesame. They also cover business lessons on combining PLG with enterprise deployment, team structure, pricing based on customer value, and growth to over $430M in revenue with ~450 employees.
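The cascaded pipeline mentioned in the description (transcription, then translation/LLM, then speech generation) can be illustrated with a minimal Python sketch. Every function here is a hypothetical toy stand-in written for illustration, not an ElevenLabs API or model: the "audio" is just a string, the translation step is a two-word lookup, and the speech step only wraps text in a tag. The point is the shape of a cascade, in which each stage consumes the previous stage's output.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Stage:
    """One step of the cascade: a name plus a text-in, text-out function."""
    name: str
    run: Callable[[str], str]


def transcribe(audio: str) -> str:
    # Hypothetical stand-in for speech-to-text: the "audio" is already text.
    return audio.strip()


def translate(text: str) -> str:
    # Hypothetical stand-in for the translation/LLM step: a toy word lookup.
    lookup = {"hello": "hola", "world": "mundo"}
    return " ".join(lookup.get(word, word) for word in text.lower().split())


def synthesize(text: str) -> str:
    # Hypothetical stand-in for speech generation: tag the text as "speech".
    return f"<speech:{text}>"


def run_cascade(stages: List[Stage], source: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Feed each stage's output into the next and record every handoff."""
    trace = []
    value = source
    for stage in stages:
        value = stage.run(value)
        trace.append((stage.name, value))
    return value, trace


PIPELINE = [
    Stage("transcription", transcribe),
    Stage("translation", translate),
    Stage("speech_generation", synthesize),
]

result, trace = run_cascade(PIPELINE, "  Hello world ")
for name, output in trace:
    print(f"{name}: {output}")
```

In this sketch every handoff between stages is plain text. That makes each stage easy to swap or inspect, but a text-only handoff carries no tone or emotion from the original audio, which is one plausible reason a fused multimodal system might be preferred; the talk itself is the authority on the tradeoffs actually discussed.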
Guest Speaker:
Mati Staniszewski is the CEO and co-founder of ElevenLabs, the AI voice/audio platform. Born in 1995 in a town outside Warsaw, Poland, he attended Copernicus Bilingual High School in Warsaw before earning a degree in mathematics from Imperial College London. While at Imperial, he organized Mathscon, a UK student-led mathematics conference. His earlier career included roles at Opera Software, BlackRock (where he worked in the Portfolio Analytics Group and helped launch the Aladdin Wealth platform), and Palantir Technologies (as a Deployment Strategist managing large-scale public- and private-sector implementations). In 2022, he co-founded ElevenLabs with his high school friend Piotr Dabkowski. He has raised hundreds of millions from investors, including Sequoia, Andreessen Horowitz, and Salesforce Ventures, with the company valued at $11 billion as of February 2026. He joined the board of Klarna in 2025 and was named to Forbes 30 Under 30 Europe in 2024 and TIME's 100 Most Influential People in AI in 2025.
