AI Initiative Speaker Series: Generative AI and Copyright Law

Stanford Law School
Jun 2, 2026

Why It Matters

Understanding how copyright law applies to AI training and output is crucial for companies to mitigate litigation risk and design compliant generative‑AI products.

Key Takeaways

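  • Two Northern District of California rulings have treated training AI on copyrighted works as fair use.
  • Those courts viewed training as transformative, citing the creation of a new model and the temporary, non‑public nature of the copies.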
  • Model size influences memorization; larger models may reproduce copyrighted text.
  • Licensing markets for training data are emerging, challenging fair‑use assumptions.
  • Extracting verbatim excerpts from models can trigger copyright infringement claims.

Summary

Mark Lemley, a leading law professor, opened the AI Initiative’s lunch workshop by dissecting the intersection of generative AI and copyright law. He outlined three core legal questions: whether training AI on existing works infringes copyright, whether AI‑generated outputs can infringe, and who owns AI‑created content.

Lemley highlighted two Northern District of California rulings that treated AI training as fair use, emphasizing the transformative nature of creating a new model, the temporary and non‑public nature of copied data, and the lack of demonstrable market harm. He warned, however, that a nascent licensing market for training data could erode these defenses, especially as companies begin to negotiate bulk licenses.

Empirical research presented by Lemley and co‑author Cooper showed that model size matters: Llama 3.1 memorizes and can reproduce large passages of Harry Potter, while smaller models like Pythia do not. Techniques such as the “poem‑poem‑poem” attack can coax models into emitting copyrighted text or personal information, which shows how much memorization varies across models and datasets.

The discussion signals that firms deploying generative AI must evaluate data‑licensing strategies, implement robust output‑filtering safeguards, and monitor evolving case law. As courts refine fair‑use doctrine and licensing markets mature, the legal risk profile for AI products is likely to shift.

Original Description

AI Initiative Speaker Series
Generative AI and Copyright Law: Authorship, Fair Use, and the Future of Creativity
Stanford Law School | May 21, 2026
Sponsored by the Stanford Law School AI Initiative
Professor Mark Lemley examined the implications of generative AI for copyright law, with a focus on authorship, fair use, and creative ownership. The conversation explored how legal doctrine must evolve in response to machine-generated content and what the rise of AI means for the future of creative work.
The discussion was moderated by Professor Nate Persily.
#AI #CopyrightLaw #GenerativeAI
