Is RAG Dead? Not If Accuracy Matters [Alex Bowcut] - 769

June 9, 2026

TWiML AI (This Week in Machine Learning & AI)

Why It Matters

For regulated businesses, auditability and exact sourcing trump raw language-model fluency — meaning RAG-based systems that surface citations will remain necessary to manage legal risk and enable scalable international expansion. Firms that rely solely on large-context LLMs risk compliance errors and slower regulatory adoption without provable provenance.

Summary

As context windows expand, Alex Bowcut of Sphere argues that retrieval-augmented generation (RAG) remains essential for high-stakes, accuracy-sensitive domains like sales tax and VAT compliance. Sphere built TRAM, a document-centric system that combines retrieval, OCR and expert workflows to speed up tax review by nearly two orders of magnitude while preserving precise citations and provenance. Bowcut says pure LLM ingestion risks losing the verifiable source links that legal and regulatory answers depend on, and that messy, heterogeneous government documents still require targeted retrieval and human-in-the-loop validation. Sphere has scaled this approach through engineering and raised a Series A from a16z to tackle global tax complexity.

Original Description

As context windows grow into the millions of tokens, many AI practitioners are questioning whether retrieval-augmented generation (RAG) is still necessary. If modern models can ingest entire libraries of documents, why bother with retrieval at all?

In this episode, Alex Bowcut, Head of Engineering at Sphere, explains why the answer depends on the application. Sphere uses AI to automate global tax compliance—an environment where getting the answer right isn’t enough. Every conclusion must be backed by the correct legal citation, and every decision must withstand expert review.

We explore how Sphere built TRAM (Tax Review and Assessment Model), a production AI system that combines retrieval, reasoning models, legal review workflows, reinforcement learning, and deterministic systems to help tax experts move nearly two orders of magnitude faster while maintaining accuracy.

Along the way, we discuss why RAG remains critical in high-stakes domains, how Sphere processes legal and regulatory documents from jurisdictions around the world, retrieval architectures, semantic chunking, dense versus sparse retrieval, expert feedback loops, and the challenges of building AI systems that people can actually trust.

🗒️ Full show notes: https://twimlai.com/go/769.

🔔 Subscribe to our channel for more great content just like this: https://youtube.com/twimlai?sub_confirmation=1

📖 CHAPTERS

===============================

00:00 Intro: Is RAG Obsolete?

01:24 Meet Sphere and TRAM

02:02 Why Tax Content Is Hard

03:54 How AI Supercharges Tax Experts

05:14 Alex Background Story

07:03 Messy Legal Data Ingestion

08:58 How TRAM Review Works

11:19 What Triggers Updates

15:13 Legal Review Not Labeling

16:08 Chunking and Indexing Law

21:21 Dense Versus Sparse Search

25:07 Taxonomy Driven Queries

27:55 RAG Is Dead Debate

29:55 Citations and Traceability

31:22 RFT for Accuracy Gains

34:50 Evals and Model Drift

37:47 LLM Reranking and Expansion

40:28 Chasing Nines Accuracy

42:39 Context Windows Impact

44:46 Costs and Latency Reality

45:40 Future Roadmap for TRAM

48:57 Personal AI Workflow Tools

50:30 Closing

🗣️ CONNECT WITH US!

===============================

Subscribe to the TWIML AI Podcast: https://twimlai.com/podcast/twimlai/

Follow us on Twitter: https://twitter.com/twimlai

Follow us on LinkedIn: https://www.linkedin.com/company/twimlai/

Join our Slack Community: https://twimlai.com/community/

Subscribe to our newsletter: https://twimlai.com/newsletter/

Want to get in touch? Send us a message: https://twimlai.com/contact/
