Court Affirms AI Training as Fair Use - Science Techniz

Federal Court Finds AI Training on Lawfully Acquired Books Can Be Fair Use
In a landmark decision, a federal judge in California has ruled that using copyrighted books to train AI models can qualify as legally transformative, meeting the bar for fair use. However, the ruling draws a sharp distinction between lawful acquisition of data and illicit copying from pirated libraries: the latter remains actionable and led to a settlement with AI firm Anthropic.

In the case, Bartz v. Anthropic, Judge William Alsup delivered a nuanced verdict: while training large language models on copyrighted works is indeed “exceedingly transformative” and supports fair use, the use of pirated books obtained from shadow libraries is not. As a result, although Anthropic prevailed on fair use in the training context, the lawsuit continued over its acquisition methods.

Specifically, the court held that copying books acquired by lawful purchase (or even digitizing owned print copies) for the purpose of model training qualifies as transformative and fair. In contrast, the retention of millions of pirated books in Anthropic’s dataset library was ruled infringing. Shortly after the ruling, Anthropic agreed to a staggering US$1.5 billion settlement to compensate authors and publishers whose works were drawn from piracy sources. At an estimated $3,000 per title for roughly 500,000 affected works, it is the largest copyright settlement in U.S. history.
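The reported figures are consistent with one another; a quick sanity check (using the estimated per-title payout and work count cited in settlement coverage, not exact court numbers) confirms the arithmetic:

```python
# Rough check of the reported settlement arithmetic.
# Both inputs are estimates from press coverage, not official court figures.
per_title = 3_000           # estimated payout per affected title (USD)
titles = 500_000            # roughly 500,000 affected works
total = per_title * titles  # implied settlement size

print(f"${total:,}")  # prints "$1,500,000,000", i.e. US$1.5 billion
```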

Authors’ Reactions 

Many authors and writers were outraged that the ruling left broader questions unsettled, particularly around compensation and consent. Many fear it may embolden AI firms to train on copyrighted content without licensing, on the argument that transformation alone justifies such ingestion. Prominent authors’ groups, including the Authors Guild, warned that the ruling may “legitimize exploitation” of creative work while offering little recourse for individual writers.

Even those compensated by the settlement argue that one-time payments do not replace systemic licensing mechanisms. Authors of niche works worry their intellectual property may be used for decades of AI training with no ongoing remuneration, effectively collapsing the market for smaller or specialty titles.

Legal Implications

The technical and policy complexities are escalating. Legal analysts stress that fair use remains intensely fact-specific. Judge Vince Chhabria’s similar fair use ruling in Kadrey v. Meta reinforced that fair use depends greatly on how the data was obtained and whether demonstrable economic harm can be shown (aalrr.com). Together, these rulings suggest courts are sympathetic to AI training as transformative but unwilling to excuse piracy.

Some legal experts point to parallels with Authors Guild v. Google, where scanning books to enable text search was found transformative. Yet, unlike Google’s snippet view, AI models can reproduce longer passages or stylistic imitations, raising concerns that the transformative-use argument may not fully capture the risks of substitution or market erosion.

Industry Fallout

The $1.5 billion settlement highlights the growing financial risks of training on tainted datasets. AI companies are now under pressure to adopt clean data practices—relying on licensed content, synthetic data, or publicly available works. Some have already begun cutting deals with publishers, news outlets, and image libraries, anticipating stricter scrutiny in future cases. Startups, meanwhile, worry that legal uncertainty and licensing fees will favor deep-pocketed incumbents. If training data becomes tightly controlled, large players may secure exclusive licenses, potentially locking smaller labs out of high-quality corpora.

For lawmakers, the rulings raise urgent questions. Should Congress legislate a new licensing framework for AI training? Should authors be compensated through collective rights organizations, similar to how music royalties are managed? Or should the transformative use doctrine continue to shield AI developers? No clear consensus has emerged, but the debate is intensifying across Washington and Silicon Valley alike.

In summary: yes, training on lawfully obtained books is likely fair use under today’s interpretation, but the use of pirated content remains a major liability. Anthropic’s billion-dollar payout shows that the economics of AI training are tied not just to GPUs and electricity, but also to intellectual property. As both AI capabilities and legal frameworks evolve, the debate over authorship, creativity, and compensation is far from over.
