Inworld AI Releases TTS-1.5 For Realtime, Production Grade Voice Agents

MarkTechPost • January 21, 2026

Companies Mentioned

LiveKit

X (formerly Twitter)

Why It Matters

Ultra‑low latency and cost‑effective pricing enable scalable, interactive voice assistants across consumer and enterprise markets, giving developers a reliable foundation for real‑time conversational experiences.

Key Takeaways

•P90 time‑to‑first‑audio under 250 ms (Max) and 130 ms (Mini)
•Expressiveness up 30%; word error rate (WER) down roughly 40%
•Pricing of $5 (Mini) and $10 (Max) per million characters, or fractions of a cent per minute of speech
•Supports 15 languages, with instant and professional voice cloning
•Available via cloud API and on‑prem; integrates with LiveKit, Pipecat, and Vapi

Pulse Analysis

The text‑to‑speech landscape has long grappled with the trade‑off between latency and naturalness, especially for interactive agents that must respond as quickly as a chatbot’s text output. Inworld’s TTS‑1.5 tackles this head‑on by optimizing the P90 time‑to‑first‑audio metric, delivering sub‑250 ms responses for the Max model and sub‑130 ms for the Mini variant. This speed aligns TTS latency with modern GPU‑accelerated language models, ensuring seamless voice‑first experiences in gaming, virtual assistants, and customer‑support bots.
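To make the P90 figure concrete: a P90 time-to-first-audio under 250 ms means that 90% of requests start returning audio within that budget. The following is a minimal sketch of how a team might check its own measurements against the quoted budgets. It uses the nearest-rank percentile definition, which is one common convention and not necessarily the one Inworld uses; the function names and sample values are hypothetical.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Nearest rank: the smallest value such that at least pct% of the
    # samples are less than or equal to it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_p90_budget(ttfa_ms, budget_ms):
    """True if the P90 of the time-to-first-audio samples is under budget_ms."""
    return percentile_nearest_rank(ttfa_ms, 90) < budget_ms


# Hypothetical time-to-first-audio measurements in milliseconds.
measured_ms = [90, 95, 100, 105, 110, 115, 120, 125, 128, 400]

# Budgets quoted in the article: 250 ms for Max, 130 ms for Mini.
print("P90 (ms):", percentile_nearest_rank(measured_ms, 90))
print("Within Mini budget:", meets_p90_budget(measured_ms, 130))
```

Note that a single slow outlier (400 ms here) does not affect the P90, which is why percentile targets are usually paired with a separate look at tail latency.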

Beyond raw speed, TTS‑1.5 improves expressive fidelity and operational stability. Inworld reports a 30% boost in prosodic variety (emphasis, emotion, and rhythm) and a roughly 40% reduction in word error rate, which means fewer of the truncations and mispronunciations that can break immersion. Multilingual coverage spans 15 major languages. Two cloning pathways are offered: instant cloning, which generates a custom voice from as little as 15 seconds of audio, and professional cloning, which builds branded personas from longer recordings.

From a business perspective, the pricing model of $5 per million characters for Mini and $10 for Max translates to fractions of a cent per minute of speech, making continuous, high‑volume deployment financially viable. The two deployment options, cloud‑hosted and on‑prem, address data‑sovereignty concerns while preserving performance parity. Integrations with platforms such as LiveKit, Pipecat, and Vapi simplify end‑to‑end pipeline construction, positioning TTS‑1.5 as a turnkey option for companies seeking to embed reliable, cost‑effective voice interaction at scale.
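The per-minute claim can be checked with simple arithmetic. The sketch below converts the per-character prices quoted above into a cost per minute of speech. The rate of 900 characters per minute is an assumption (roughly 150 spoken words per minute at about six characters per word including spaces), not a figure from Inworld, and the names used are illustrative.

```python
# Prices quoted in the article, in USD per million characters.
PRICE_PER_MILLION_CHARS = {"mini": 5.00, "max": 10.00}

# ASSUMPTION: about 900 characters of text per minute of synthesized speech.
ASSUMED_CHARS_PER_MINUTE = 900


def cost_per_minute_usd(model, chars_per_minute=ASSUMED_CHARS_PER_MINUTE):
    """Estimated USD cost for one minute of speech from the given model."""
    return PRICE_PER_MILLION_CHARS[model] * chars_per_minute / 1_000_000


for model in ("mini", "max"):
    cents = cost_per_minute_usd(model) * 100
    print(f"{model}: about {cents:.2f} cents per minute")
```

Under that assumption, Mini comes to about 0.45 cents and Max to about 0.9 cents per minute, consistent with the "fractions of a cent" description. Faster speech or denser text would raise the figure proportionally.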
