Seance 2.0 could democratize video creation, giving businesses and creators rapid, low‑cost access to cinematic‑quality content while reshaping the economics of media production.
ByDance unveiled Seance 2.0, an AI‑driven video generation engine that lets users create short films from prompts, images, audio, or existing clips. The platform combines text, image, audio, and video inputs in a unified multimodal architecture, allowing up to nine still images, three video clips, and three audio tracks to be blended with natural‑language instructions.
The company claims the new model delivers markedly better motion stability, physical realism, and controllability than its predecessor, Seance 1.5. In benchmark tests it outperformed rivals such as Soro 2 Pro and VO3.1 on text‑to‑video, image‑to‑video, and mixed‑modal tasks, achieving higher scores for motion quality, audio‑visual sync, and overall performance. It can generate 15‑second multi‑shot sequences with dual‑channel audio and offers granular director‑level adjustments to lighting, shadows, and camera movement.
A standout feature highlighted by ByDance is the “director‑level control” interface, which lets users fine‑tune character performance, lighting, and camera paths in real time. The model also generates audio and video jointly, enabling synchronized soundtracks without post‑production editing. However, the team acknowledges lingering issues with fine‑detail stability, hyper‑realistic rendering, and accurate lip‑sync for multiple speakers.
If the technology matures, it could lower the barrier to high‑quality video production, allowing marketers, educators, and independent creators to produce cinematic content without costly equipment or crews. The rollout signals a shift toward AI‑centric creative pipelines and may pressure traditional production studios to adopt similar tools.