NY judge orders OpenAI to hand over ChatGPT conversations in win for newspapers in copyright case

A Manhattan judge has ordered OpenAI to provide the Pioneer Press and other news outlets with millions of anonymous chats between ChatGPT and its users in a major ongoing copyright infringement case.

In a nine-page order made public Wednesday, Manhattan Magistrate Judge Ona Wang denied OpenAI’s request to reconsider her November ruling requiring the tech giant to hand over 20 million ChatGPT output logs to the media outlets.

The newspapers want to analyze a sample of ChatGPT’s consumer logs to test its large language model and see whether and how it’s propagating journalists’ work.

The ruling comes in a major consolidated class-action lawsuit against Microsoft and OpenAI initiated in 2023, in which The New York Times and news outlets affiliated with Tribune Publishing and MediaNews Group allege the artificial intelligence company is stealing and distorting their copyrighted works. The Authors Guild and a litany of best-selling writers are also parties in the complex litigation.

“OpenAI’s leadership was hallucinating when they thought they could get away with withholding evidence about how their business model relies on stealing from hardworking journalists, and we look forward to holding them accountable for their ongoing misappropriation of our work. They should pay for the copyright-protected work they use to build and maintain their apps and products, and they know it,” said Frank Pine, executive editor of MediaNews Group and Tribune Publishing, in a statement.

Spokespeople and lawyers for OpenAI did not immediately respond to requests for comment. A company spokesman pointed Reuters to an OpenAI blog post that said the request to turn over the chats “disregards long-standing privacy protections” and “breaks with common-sense security practices.”

But in the decision, Wang reaffirmed that users’ privacy was not in jeopardy, noting that OpenAI had almost completed an internal process to anonymize the chats. Also mitigating privacy risks raised by OpenAI are the “multiple layers of protection in this case precisely because of the highly sensitive and private nature of much of the discovery that is exchanging hands,” the judge wrote.

She found the A.I. conversations were “clearly relevant” to the news outlets’ claims that they contain partial or complete reproductions of their copyrighted works and to OpenAI’s defense that they contain other user activity.

“News Plaintiffs are entitled to discovery on both,” the judge wrote.

“Production of the 20 Million ChatGPT Logs is also proportional to the needs of the case. The total universe of retained consumer output logs is in the tens of billions. The 20 million sample here represents less than 0.05% of the total logs that OpenAI has retained in the ordinary course of business.”

Wang wrote that once the tech behemoth has completed the deidentification process, it will have seven days to hand over the data. OpenAI has also appealed Wang’s November order to Manhattan Federal Judge Sidney Stein, the district judge overseeing the case.

The 20 million chats represent a small fraction of the tens of billions of output logs OpenAI has retained, Wang noted in her Wednesday order.

Steven Lieberman, an attorney for MediaNews Group and Tribune Publishing, in a statement pointed to Wang’s finding that OpenAI had withheld “critically important evidence” when it was first requested and rejected the tech giant’s arguments to the contrary.

“The Court also raised the issue of whether OpenAI’s efforts to delay production of the ChatGPT logs was motivated by an improper purpose, saying of the two possible explanations for OpenAI’s behavior: [n]either bode well for OpenAI.”
