OpenAI: ChatGPT Output Preservation ‘Unprecedented’ Privacy Violation


(May 27, 2025, 2:11 PM EDT) -- SAN FRANCISCO — Requiring preservation of ChatGPT outputs that users wish to delete, simply so news plaintiffs in a copyright suit can secure a litigation advantage, constitutes an “unprecedented” privacy violation and sets a “dangerous precedent,” OpenAI entities tell a federal court in New York in a May 23 supplemental opposition filed after a magistrate judge ordered the preservation and denied a motion for reconsideration.

(In re OpenAI ChatGPT Litigation, No. 25-3143, S.D. N.Y.)

(OpenAI’s supplemental brief on output preservation available.  Document #46-250604-073B.)

Daily News LP; Chicago Tribune Co. LLC; Orlando Sentinel Communications Co. LLC; Sun-Sentinel Co. LLC; San Jose Mercury News LLC; DP Media Network LLC; ORB Publishing LLC; and Northwest Publications LLC (Daily News plaintiffs), The New York Times Co. and Center for Investigative Reporting (all collectively, news plaintiffs) filed suits challenging OpenAI entities’ use of copyrighted material to train artificial intelligence.  The defendants are OpenAI Inc., OpenAI LP, OpenAI OpCo LLC, OpenAI GP LLC, OpenAI Startup Fund GP I LLC, OpenAI Startup Fund 1 LLC and OpenAI Startup Fund Management LLC (collectively, OpenAI).  The suits were eventually consolidated and then transferred to multidistrict litigation in the U.S. District Court for the Southern District of New York.  All told, the MDL came to include 12 actions.

In January, the news plaintiffs notified the court about OpenAI’s potential deletion of output log data.  At a Jan. 22 conference, the court denied the plaintiffs’ request for wholesale preservation of output log data but inquired whether there was any way to segregate savable outputs from ChatGPT users who wished to have their data deleted or, in the alternative, to anonymize the data to address privacy concerns.

Data Preservation

The news plaintiffs filed a letter brief renewing their request that OpenAI preserve output log data.  In an opposition, OpenAI complained that the news plaintiffs sought retention not only of what federal law requires but of all data regardless of relevancy.

In an order, Magistrate Judge Wang directed OpenAI to preserve and segregate all output log data that would otherwise be deleted.  Magistrate Judge Wang noted that while the hearing focused on the fact that many billions of conversations are preserved, in their supplemental brief the plaintiffs contend that OpenAI is deleting a significant volume of conversations.  Even in its most recent filings OpenAI has not indicated whether it has taken any steps to preserve data or even whether it could, Magistrate Judge Wang said.

In a separate ruling, Magistrate Judge Wang said she would hold a status conference regarding potential spoliation motions involving the deletion of output logs and the parties could file supplemental briefing on the matter.

Both orders were issued May 13.

U.S. Magistrate Judge Ona T. Wang of the Southern District of New York denied reconsideration or modification in a May 16 order, one day after OpenAI’s May 15 motion.  Also on May 16, Jason Bramble, as president of Spark Innovations Corp., moved to intervene for the limited purpose of opposing preservation of all ChatGPT outputs.

Challenges

OpenAI moved for reconsideration or modification, arguing that the ruling requires it to “disregard legal, contractual, regulatory, and ethical commitments to hundreds of millions of people, businesses, educational institutions, and governments around the world—even though there is no reason to believe these drastic measures will advance this litigation.  To be clear: OpenAI is taking the steps it can to comply with the Order despite the many practical and engineering challenges compliance entails.  But we urge the Court to reconsider the Order, both to correct a manifest injustice and because the Order is based on material misrepresentations in Plaintiffs’ recent letter brief,” OpenAI said.

Magistrate Judge Wang denied reconsideration without prejudice to renewal.  OpenAI has not shown that reconsideration would compel a different outcome, Magistrate Judge Wang said.

As to OpenAI’s request for modification, Magistrate Judge Wang said the order was due in part to the fact that OpenAI had not yet responded to the January 2025 inquiry about what steps could be taken to preserve output data.  OpenAI’s recent declaration that it is taking steps to preserve output allows the court to focus on relevance and proportionality questions, Magistrate Judge Wang said.

Magistrate Judge Wang then said that at the Jan. 22 conference she specifically addressed the argument that the deleted outputs would not deviate from the representative sample OpenAI has already produced.  The problem is that a ChatGPT user may have gotten around the restrictions to access paywalled material and then, having learned of the lawsuit against OpenAI, requested deletion of the outputs, Magistrate Judge Wang said.

Magistrate Judge Wang also found that there were open questions about the proportionality of requiring preservation of the outputs.  OpenAI has partially addressed the issue by directing the court to the technical problems and privacy considerations output preservation implicates.  But the parties can meet and confer on these issues.  At this time, the issues cited by OpenAI do not require modification of the order, Magistrate Judge Wang said.

Opposition

In the supplemental opposition, OpenAI says, “While nothing could justify such a blatant affront to user privacy, News Plaintiffs’ stated rationale is particularly weak.  In effect, News Plaintiffs ask the Court to override the decisions of OpenAI’s users in order to facilitate a dragnet for evidence there is no reason to believe exists, much less in any material amount.”

Instead of blanket preservation, OpenAI says it has proposed a way forward that allows for preservation of potentially relevant outputs, which the plaintiffs rejected.  OpenAI says the question isn’t whether OpenAI wants the records destroyed, but whether its users want the records destroyed.  OpenAI says that it is “strongly committed to user privacy” and that those users should have a say in what happens to their data.  OpenAI notes that ChatGPT outputs include everything from the mundane, like household budgets, to the personal, like wedding vows, and the highly confidential, such as business plans and information.

OpenAI says that it has already taken steps to comply with the court’s ruling but that full compliance “will require OpenAI to undertake a massive project to overhaul and rebuild elements of its core data infrastructure” and complicates efforts to meet data privacy rules and regulations around the world.

Counsel

The news plaintiffs are represented by Steven Lieberman of Rothwell, Figg, Ernst & Manbeck PC in Washington, D.C., Ian B. Crosby of Susman Godfrey LLP in Seattle and Matt Topic of Loevy & Loevy in Chicago.

OpenAI is represented by Joseph C. Gratz of Morrison & Foerster LLP in San Diego, Edward A. Bayley of Keker, Van Nest & Peters LLP in San Francisco and Elana Nightingale-Dawson of Latham & Watkins LLP in Washington, D.C.

(Additional documents available: Bramble’s motion to intervene.  Document #46-250604-074B.  Order on reconsideration and modification.  Document #46-250604-042R.  OpenAI’s letter motion.  Document #46-250604-043M.  Order on output preservation.  Document #46-250604-024R.  Spoliation order.  Document #46-250604-027R.  News plaintiffs’ letter.  Document #46-250604-025B.  OpenAI’s letter.  Document #46-250604-026B.  MDL transfer order.  Document #46-250604-028R.)