OpenAI Responds to The New York Times Lawsuit: A Clarification of Intent and Practices

By Chris McKay • January 8, 2024

After being sued in December by The New York Times for allegedly scraping articles without permission to train its AI systems, AI lab OpenAI issued a strongly worded rebuttal today, rejecting the newspaper's legal claims and accusations.

In a statement published on its website, OpenAI defended its use of online content for model training as legally protected fair use. It also accused The Times of mischaracterizing rare instances of content regurgitation as widespread system flaws.

"We regard The New York Times' lawsuit to be without merit," OpenAI said. "Still, we are hopeful for a constructive partnership."

OpenAI outlined numerous collaborations with news outlets, including the AP and Axel Springer, to provide AI-assisted tools for journalists and explore mutually beneficial opportunities around real-time content distribution.

“Our goals are to support a healthy news ecosystem, be a good partner, and create mutually beneficial opportunities,” the company said.

This collaborative positioning stands in contrast to The Times’ confrontational legal action, which OpenAI said came unexpectedly after productive talks exploring potential partnerships.

Central to the dispute is whether scraping online content to train AI models constitutes legally protected fair use or copyright infringement. OpenAI argues that legal precedent and public comments submitted to the Copyright Office support the fair use view.

But it also gives publishers the ability to opt out of web scraping, something The Times did in August 2023. OpenAI said that respecting creators matters more to it than exercising its legal rights.

OpenAI strongly contested The Times’ implication that content regurgitation is a widespread issue in systems like ChatGPT. The company characterized verbatim reproductions as rare bugs that it continually works to minimize, not as inherent system behavior or a substitute for original reporting.

It also accused The Times of intentionally manipulating prompts to induce regurgitation in order to cherry-pick examples and misrepresent the severity of the issue.

This response from OpenAI is not just a rebuttal to a legal claim but a commentary on the evolving relationship between AI technology and journalism. As AI becomes increasingly integral to various sectors, its interaction with existing laws and ethical standards is becoming more complex. OpenAI's statement reflects an attempt to navigate these waters responsibly while advocating for the transformative potential of AI in journalism.

The legal battle with The New York Times, however, is more than a single dispute; it represents a critical juncture in the ongoing discourse on AI and copyright law. The outcome could have far-reaching implications for digital content creation, the future of journalism, and the boundaries of fair use in the digital age.

This case underscores a broader challenge in the AI industry: balancing innovation with respect for intellectual property rights. OpenAI’s emphasis on collaboration and mutual benefit with news organizations highlights a proactive approach to addressing these challenges. However, the lawsuit by The New York Times illustrates the complexities and uncertainties surrounding AI's use of copyrighted material.

The issue of 'regurgitation' is particularly pertinent. While OpenAI maintains that it is rare and says it is working to eliminate it, the concern that AI systems could inadvertently undermine journalistic integrity persists. This calls for a nuanced understanding of AI's capabilities and limitations, especially as AI models continue to evolve.

In its response, OpenAI extends an olive branch to The New York Times, expressing hope for a constructive partnership. This gesture indicates a willingness to engage in dialogue and find common ground, which could benefit both the AI and journalism sectors. The potential for AI to support and enhance journalistic efforts is significant, and finding ways to do so without infringing copyright is crucial.

As this legal battle unfolds, it's clear that the relationship between AI and journalism is at a crossroads. The outcome of this case will likely influence how AI technologies are developed and used in relation to copyrighted content. It's a pivotal moment that could redefine the interplay between technological innovation and the safeguarding of intellectual property in the digital realm.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.