Adobe Hit with Proposed Class-Action, Accused of Misusing Authors’ Work in AI Training

•December 18, 2025

TechCrunch AI•Dec 18, 2025

Companies Mentioned

Adobe

ADBE

Anthropic

Cerebras

CBRS

Apple

AAPL

Salesforce

CRM

Why It Matters

The lawsuit could set a legal precedent that forces AI developers to secure clear licensing for training data, raising compliance costs and reshaping how generative models are built.

Key Takeaways

•Adobe sued for using pirated books in SlimLM training.
•SlimLM built on SlimPajama, derived from RedPajama/Books3 dataset.
•Lawsuit represents growing class-action trend against AI firms.
•Potential damages could reshape AI data licensing practices.
•Anthropic settlement shows rising legal risks for AI developers.

Pulse Analysis

The Adobe case highlights a growing tension between rapid AI innovation and intellectual‑property law. SlimLM, marketed as a lightweight language model for mobile document assistance, relies on the SlimPajama‑627B dataset, which itself incorporates the RedPajama and Books3 collections. Those source corpora have been flagged for containing millions of copyrighted works harvested without author consent, a practice that has already drawn scrutiny in high‑profile suits against Apple and Salesforce. By alleging that Adobe’s model was trained on pirated material, the lawsuit amplifies concerns that many AI products may be built on legally questionable foundations.

For Adobe, the legal exposure extends beyond potential damages; it threatens the credibility of its AI portfolio, including the popular Firefly suite. Companies may now need to audit their data pipelines, implement stricter provenance tracking, and negotiate licensing agreements with rights holders. The cost of retrofitting compliance mechanisms could be substantial, especially for firms that have relied on open‑source datasets presumed to be safe. Moreover, the case could encourage regulators to issue clearer guidance on acceptable data sourcing, prompting a shift toward more transparent, consent‑driven training practices across the industry.

The broader market impact could be profound. As litigation mounts, investors and enterprise customers may demand higher assurance that AI solutions respect copyright, driving demand for models trained on licensed or synthetic data. This pressure could accelerate the development of alternative datasets, such as those generated through federated learning or public‑domain curation. Ultimately, the Adobe lawsuit underscores a pivotal moment where legal risk and ethical considerations are likely to shape the next generation of generative AI, compelling firms to balance innovation speed with responsible data stewardship.