Shipmas Day 2: Podcast Anything App (Gemini 3)

All About AI
Dec 6, 2025

Why It Matters

This lowers the friction of repurposing written content into polished audio, letting creators and businesses quickly produce multi-voice podcast episodes for distribution or accessibility. It also illustrates how LLMs combined with multi-speaker TTS can automate content transformation and editing workflows at scale.

Summary

The developer built a web app that converts uploaded documents (PDF, Markdown, plain text) into multi-voice podcast episodes, using Gemini 3 to generate the script and a multi-speaker TTS API to produce the audio. The interface offers controls for tone (roast, steelman, explain like a fifth grader), length range, and voice selection, and displays a timeline with playback and download options. In the demos, the tool turned a credit-card statement into a two-voice comedic roast and summarized a robotics benchmark paper as a short, kid-friendly explainer. The project was scaffolded with a Python TTS backend and a Next.js-style frontend, using the Gemini API docs and an agent-driven planning step to assemble features quickly.
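The script-generation step described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function names, the tone wording, the 150 words-per-minute budget, and the "Speaker: text" script format are all assumptions made for the example. It covers only the two parts that need no network access, which are building the prompt that would be sent to the LLM and splitting the returned script into per-speaker turns that a multi-speaker TTS call could consume.

```python
from dataclasses import dataclass

# Hypothetical tone presets. The labels mirror the controls named in the
# summary; the instruction wording is an assumption for this sketch.
TONE_INSTRUCTIONS = {
    "roast": "Playfully roast the content of the document.",
    "steelman": "Present the strongest version of the document's argument.",
    "eli5": "Explain the document the way you would to a fifth grader.",
}

# Assumed speaking rate, used to turn a length range in minutes into words.
WORDS_PER_MINUTE = 150


@dataclass
class Turn:
    speaker: str
    text: str


def build_script_prompt(document_text, tone, min_minutes, max_minutes, speakers):
    """Assemble the text prompt for the script-writing model (hypothetical helper)."""
    if tone not in TONE_INSTRUCTIONS:
        raise ValueError(f"unknown tone: {tone}")
    if min_minutes > max_minutes:
        raise ValueError("min_minutes must not exceed max_minutes")
    low = min_minutes * WORDS_PER_MINUTE
    high = max_minutes * WORDS_PER_MINUTE
    names = ", ".join(speakers)
    return (
        f"Write a podcast script for these speakers: {names}.\n"
        f"{TONE_INSTRUCTIONS[tone]}\n"
        f"Target length: {low} to {high} words.\n"
        "Format every line as 'Speaker: text'.\n\n"
        f"Document:\n{document_text}"
    )


def parse_script(script, speakers):
    """Split a 'Speaker: text' script into turns (hypothetical helper).

    Lines that do not start with a known speaker name are treated as a
    continuation of the previous turn; such lines before any turn are dropped.
    """
    turns = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        name, sep, text = line.partition(":")
        if sep and name.strip() in speakers:
            turns.append(Turn(name.strip(), text.strip()))
        elif turns:
            turns[-1].text += " " + line
    return turns
```

In a full pipeline, the prompt would go to the Gemini API and each parsed turn would be mapped to the voice selected for that speaker before synthesis; both of those calls are left out here because their exact request shapes are not given in the source.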

Original Description

Shipmas Day 2: Podcast Anything App (Gemini 3)
kbfseo@gmail.com
