Judge Dismisses Most of Authors' Claims Against OpenAI, Training Likely Fair Use

Is reading material copyright infringement? A judge in the OpenAI lawsuit doesn't appear to think so after dismissing large chunks of the authors' claims.

If you personally read a copyrighted work, does that act automatically constitute copyright infringement? Obviously not; such an idea is completely absurd. Yet that is what one lawsuit alleges. The only difference? It's AI (Artificial Intelligence) reading that material and not a human. Some of the language used to describe the act may be confusing, but there is a fairly obvious case for fair use here. Specifically, that language involves "training" on copyrighted works. Yes, AI is generally far more efficient at this than humans, but that isn't really an argument against the practice in the end.

Another point of contention is whether creating summaries of a copyrighted work is fair use or not. In that case, well yeah, duh, of course such an act is fair use. Yet the lawsuit against OpenAI contends that because it is being done by an AI, it suddenly becomes an act of copyright infringement.

Of course, the real reason why these lawsuits are being filed is because of an irrational fear of some sort of AI apocalypse where AI just takes over everything and makes writing novels, news articles, scripts, and other written forms of material strictly the domain of AI, putting everyone out of work in the process. Such fears are obviously overblown.

There are a number of examples today where AI has surpassed human abilities. This includes Jeopardy (Watson), chess (Deep Blue, Stockfish, AlphaZero), and Go (AlphaGo, AlphaGo Zero). Moreover, AI is also giving humans a run for their money in poker as well. All of the above examples have one thing in common: people still professionally play these games and even make a living off of them. Just because computers can beat humans at something doesn't mean that humans simply stop playing these games or that players are suddenly unable to make money off of the activities.

What's more, these are activities in which AI either dominates humans or is working toward dominating them. Writing, on the other hand, isn't even close to that level of expertise on the part of AI. There were efforts to replace lawyer jobs with AI, such as DoNotPay. That ended badly. There were efforts to have ChatGPT write legal briefs for lawyers which, of course, ended badly. CNET tried replacing its news writing staff, which ended badly. Gannett tried something similar, which ended badly as well.

The thing is, there are numerous examples where AI was simply made to write content that sounds human-like. That's it. It wasn't designed to differentiate what is truthful from what is not. It was designed with the goal of writing something that sounds like it could have been written by a human. So, when it "hallucinates" something, that isn't necessarily a fault within the machine learning it does. It simply tried to write something that sounds like it could have been written by a human. When people are fooled by the hallucinations, then technically the language model succeeded. It's precisely for this reason that AI makes a terrible journalist, but would make a great translator from one language to another.

It is also for these reasons that I consider the comments that AI will cause humanity to go extinct to be comically out of touch with reality. The idea of a glorified autocomplete ending humanity doesn't even pass the laugh test for me. Yet some people act out in response to these irrational fears, and that is probably how we got these lawsuits against OpenAI in the first place. A pair of lawsuits was filed against OpenAI: one from the New York Times, which we discussed earlier, and the lawsuit by Sarah Silverman and multiple other authors. According to recent reports on the latter, large chunks of the lawsuit got dismissed by a judge. From the Hollywood Reporter:

A federal judge has dismissed most of a lawsuit brought by Sarah Silverman, Ta-Nehisi Coates and other authors against OpenAI over the use of copyrighted books to train its generative artificial intelligence chatbot, marking another ruling from a court questioning core theories of liability advanced by creators in the multifront legal battle.

U.S. District Judge Araceli Martinez-Olguin, in an order issued on Feb. 12, refused to allow claims for vicarious copyright infringement, negligence and unjust enrichment to proceed against the Sam Altman-led firm. Following in the footsteps of another judge overseeing an identical suit against Meta, Martinez-Olguin rejected one of the authors’ main claims that every answer generated by OpenAI’s ChatGPT is an infringing work made possible only by information extracted from copyrighted material.

The authors failed to cite “any particular output” that is “substantially similar — or similar at all — to their books,” the court explained. They were given leave to amend, meaning that they will have another chance to refile the suit. A claim for a violation of California’s unfair competition law was permitted to advance under the theory that the company’s use of copyrighted works to train its AI model for profit constitutes an unfair business practice. Notably, OpenAI didn’t move to dismiss a claim for direct copyright infringement.

The ruling builds upon findings from two other judges, also in the Northern District of California, who expressed skepticism as to whether creators can substantiate fundamental contentions in their suits in the absence of evidence of the AI tools generating answers that appear substantially similar to the works they are alleged to infringe upon. In a case between artists and AI art generators, U.S. District Judge William Orrick called the allegations “defective in numerous respects.”

If this article is anything to go by, the judge did the obvious thing and dismissed most of the complaints about copyright infringement. It was always an absurd concept that reading material and summarizing it for others could be considered copyright infringement. Trying to argue that it's somehow different when done by a computer is just begging for such a claim to be laughed out of the court system. I, admittedly, can't really speak to the whole "unfair competition" aspect of this conclusively, though the concept does seem weird to me.

While there have been instances in the US legal system where bizarre rulings are made, this seems like a reasonable one. Unless something else crops up that says otherwise (I don't have access to the PACER system since I'm a Canadian citizen), it appears that an at least partially reasonable ruling was made here.

Drew Wilson on Twitter: @icecube85 and Facebook.
