
Talk Python to Me
#540: Modern Python Monorepo with Uv and Prek
Why It Matters
Understanding how a high‑traffic project like Apache Airflow organizes its codebase offers actionable patterns for any team dealing with large, interdependent Python projects. The episode highlights the shift toward modern packaging tools and the challenges of AI‑driven contributions, making it especially relevant for developers aiming to improve code quality, security, and collaboration in today’s fast‑moving open‑source ecosystem.
Key Takeaways
- The Airflow monorepo exceeds one million lines of Python across more than a hundred packages.
- uv and pyproject.toml manage dependencies across 140 sub‑projects.
- Roughly 310 PRs per week (about 40 per day) require strict review automation.
- AI‑generated contributions increase PR volume and demand new quality guidelines.
Pulse Analysis
In this episode the hosts dive into Apache Airflow’s massive Python monorepo, a codebase that now tops one million lines of Python across more than a hundred internal packages. By adopting modern tooling such as uv and the standardized pyproject.toml configuration, the Airflow team can orchestrate dependency resolution, build isolation, and release cycles for each sub‑project without fragmenting the repository. This approach demonstrates how a single Git repo can remain coherent while still supporting independent development streams, a model that many large‑scale Python organizations are beginning to emulate.
The conversation then shifts to the operational realities of maintaining such a sprawling codebase. Airflow processes roughly 310 pull requests each week (about 40 per day), which requires a sophisticated review pipeline, extensive automated checks, and a disciplined community of PMC members. Recent spikes in AI‑generated contributions have added both speed and noise, prompting the team to publish new contribution guidelines and tighten security vetting to protect the supply chain. These measures illustrate how high‑velocity open‑source projects balance rapid innovation with rigorous quality control.
Finally, the guests explain how the Apache Software Foundation’s merit‑based governance and Astronomer’s commercial backing create a sustainable ecosystem for Airflow. The ASF’s emphasis on community over code ensures that decisions remain developer‑driven, while Astronomer contributes resources and real‑world scaling experience. Listeners walk away with actionable patterns (monorepo structuring, modern packaging, and governance best practices) that can be applied to any large Python project seeking both flexibility and stability.
Episode Description
Monorepos -- you've heard the talks, you've read the blog posts, maybe you've seen a few tantalizing glimpses into how Google or Meta organize their massive codebases. But it's often in the abstract and behind closed doors. What if you could crack open a real, production monorepo, one with over a million lines of Python and over 100 sub-packages, and actually see how it's built, step by step, using modern tools and standards? That's exactly what Apache Airflow gives us.
Episode sponsors
Agentic AI Course
Python in Production
Talk Python Courses
Links from the show
Guests
Amogh Desai: github.com
Jarek's GitHub: github.com
definition of a monorepo: monorepo.tools
airflow: airflow.apache.org
Activity: github.com
OpenAI: airflowsummit.org
Part 1. Pains of big modular Python projects: medium.com
Part 2. Modern Python packaging standards and tools for monorepos: medium.com
Part 3. Monorepo on steroids - modular prek hooks: medium.com
Part 4. Shared “static” libraries in Airflow monorepo: medium.com
PEP-440: peps.python.org
PEP-517: peps.python.org
PEP-518: peps.python.org
PEP-566: peps.python.org
PEP-561: peps.python.org
PEP-660: peps.python.org
PEP-621: peps.python.org
PEP-685: peps.python.org
PEP-723: peps.python.org
PEP-735: peps.python.org
uv: docs.astral.sh
uv workspaces: blobs.talkpython.fm
prek.j178.dev: prek.j178.dev
your presentation at FOSDEM26: fosdem.org
Tallyman: github.com
Watch this episode on YouTube: youtube.com
Episode #540 deep-dive: talkpython.fm/540
Episode transcripts: talkpython.fm
Theme Song: Developer Rap
🥁 Served in a Flask 🎸: talkpython.fm/flasksong
---== Don't be a stranger ==---
YouTube: youtube.com/@talkpython
Bluesky: @talkpython.fm
Mastodon: @talkpython@fosstodon.org
X.com: @talkpython
Michael on Bluesky: @mkennedy.codes
Michael on Mastodon: @mkennedy@fosstodon.org
Michael on X.com: @mkennedy