Federal Judge Says AI Training on Copyrighted Books Is Fair Use

For the first time, a U.S. court has ruled that it’s legal for an AI company to train large language models on copyrighted books without the authors’ permission—at least under certain conditions.

Key Points:

  • Judge Alsup rules that training AI on copyrighted books is fair use, calling it “spectacularly” transformative.
  • Digitizing purchased books for internal use also cleared as fair use.
  • But using pirated books to build a “forever” library is not protected and heads to trial.

In a closely watched decision, Judge William Alsup sided with Anthropic on the most explosive question in the copyright wars around AI: whether training a large language model on copyrighted books is legal. His answer? Yes—so long as the books were lawfully acquired and used specifically for training.

The ruling, issued Monday in federal court in San Francisco, marks a major win for generative AI companies. Alsup concluded that Anthropic’s use of copyrighted texts to train its Claude models was “quintessentially transformative,” comparing the models to humans learning how to write by reading widely. The models didn’t regurgitate the books verbatim, and no infringing outputs were alleged, he emphasized.

But the court didn’t let Anthropic off the hook entirely.

In blistering language, Alsup slammed the company for pirating millions of books from sites like LibGen and Books3, quoting internal emails that described the practice as a shortcut to avoid a “legal/practice/business slog.” Those copies, he ruled, were used to build a central “general-purpose” library—distinct from training—and that use was not protected under fair use. A trial will now decide damages, including whether Anthropic’s actions were willful.

Alsup’s decision draws a clear legal line. If you buy the books and use them solely for training a model, that’s fair game. If you steal them and keep them indefinitely “just in case,” you’re still on the hook.

The court also blessed Anthropic’s practice of destructively scanning millions of print books it had legally purchased. Though the company turned those into digital copies for storage and searchability, Alsup ruled that the format shift was a fair use too—likening it to converting paper to microfilm or DVR recordings of TV shows.

What makes this ruling seismic isn’t just the verdict—it’s the clarity. In an industry flooded with lawsuits from authors, publishers, and news orgs, this is the first federal ruling to greenlight large-scale AI training under fair use. It provides a roadmap for AI companies: buy your data, document your uses, and don’t pretend piracy is a business model.

But the ruling doesn’t end the debate. It doesn’t apply to models that replicate copyrighted works in their outputs. It doesn’t answer whether other media—like art or music—would be treated the same way. And it doesn’t stop Congress from stepping in with new rules.

Still, for now, Judge Alsup’s message to AI companies is clear: read all you want—but pay for the book first.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.
