Thetechlawworld: June 2024

AI, COPYRIGHT AND US LEGAL SYSTEM

-Rishiraj Chandan

LLMs AND GENERATIVE AI

Hundreds of millions of people use ChatGPT and other generative systems, which shows how widespread the enthusiasm for generative AI is. Businesses are working out how to use generative AI, and some claim that GPT-4 or ChatGPT shows signs of artificial general intelligence. But not everyone is as enthusiastic or upbeat. One lawsuit challenges the legality of Codex[1], OpenAI's code-generating model, and of Copilot, a tool that suggests code in response to programmer prompts. The complaint names GitHub[2], Microsoft[3], and OpenAI as defendants.

Stability AI is facing two lawsuits in the US that contest the legality of Stable Diffusion[4]. The plaintiffs in the Copilot lawsuit are seeking $9 billion in damages as well as injunctions to shut these programs down. The two main claims are that ingesting copyrighted content from the internet or other sources infringes the copyright in the original works, and that the outputs, whether source code, text, photos, or music, are infringing derivative works. The U.S. Copyright Office is seeking public comment on this matter. Some respond, "Oh, it's fair use and no problem," while others respond, "Oh, massive piracy and we got to shut this stuff down." Unfortunately, more people share the latter response than you may think.

The Future of Life Institute's open letter called for a six-month pause on the training of the most powerful AI systems, reflecting a larger moral panic around the technology. It warns of grave dangers to humanity and civilization and urges preparation and the establishment of rules. Law and policy weigh heavily in this field. One statement on generative AI called it a "Marxist nightmare" because it benefits capitalist owners who pay no compensation for the millions of hours of labour they draw on.

Meanwhile, conferences concerning the best course of action for generative AI and AI systems in general, as well as the nature of appropriate regulations, are taking place around Europe. The three main problems that are addressed are:

a. Does it infringe copyright to use works as training data for generative AI systems?

b. When do AI-generated outputs infringe the derivative work right?

c. Who is the rightful owner of the copyright in computer program outputs that include copyrighted material?

The debate over copyright and artificial intelligence (AI) dates back to the mid-1960s. Copyright protects works of authorship from the moment they are first fixed in a tangible medium, for the life of the author plus 70 years, or for 95 years in the case of corporate authors. Authors are granted the exclusive rights to control the reproduction, distribution, public performance, and public display of their works and the creation of derivative works.

Copyright does not protect the ideas, facts, and techniques in authors' works; it protects only their creative expression. The ingestion question turns on the limits that fair use and other doctrines place on copyright's exclusive rights. In the United States, a fair use of a copyrighted work is not an infringement. When determining whether a use is fair, courts take four factors into account. The first is the purpose and character of the challenged use, including whether it serves purposes such as criticism, commentary, news reporting, teaching, research, or scholarship, and whether it is commercial or non-commercial.

The nature of the copyrighted work also matters: factual and utilitarian works receive a narrower scope of copyright protection and so leave more room for fair use, while artistic and imaginative works receive a greater scope of protection[5]. The remaining factors are the amount and substantiality of the taking and the effect of the use on the value of the work or the market for it.

As the rulings in Field v. Google[6] and Authors Guild v. Google[7] show, there is precedent suggesting that crawling the internet for works to use as training data may be considered fair use. In Authors Guild, Google's digitization of millions of in-copyright volumes from research library holdings was deemed fair use because Google was not exploiting the expression in the works. On the other hand, others argue that those rulings rested on Google's facilitating the discovery of the copyright owners' works, which generative AI does not do, and so they oppose the ingestion of works as training data.

Understanding fair use and copyright law is therefore crucial to any discussion of AI, especially when it comes to the ingestion of works.

INGESTING WORKS AS TRAINING DATA

Whether ingesting works as training data is fair use is at the centre of the generative AI controversy. It has been suggested that generative AI produces better results when it consumes high-quality content, and that the authors of such carefully curated works ought to be compensated. According to a survey conducted by the Authors Guild, 90% of its writers think that AI developers should compensate them for including their works in the training data for large language models[8].

If there is a market for licensing training data, unlicensed ingestion may harm that market, and fair use analysis takes such harm into account. There are countervailing considerations, however, such as the constitutional purpose of copyright, which is to promote the progress of science. One may argue that generative AI systems further this goal, and fair use leaves some room for innovation.

The derivative work right debate is crucial because the statute defines what it means for an author to have the exclusive right to make derivative works. For a second work to infringe the derivative work right in a first work, the second work must have taken enough creative expression from the first to be substantially similar to it. The majority of generative AI outputs will not resemble any particular work in the training data set in a meaningful way. If that is the case, it is doubtful that the outputs infringe that right.

One issue with generative AI is that if an image appears many times in the ingested works, the model may memorize it and reproduce it, giving rise to an infringement claim[9]. To prevent infringement, people are working on methods to eliminate duplicates from the training data, which makes memorization less likely. Others are attempting to use output filters to stop the generation of outputs that infringe intellectual property rights.
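The two mitigation ideas mentioned here, removing duplicates before training and filtering outputs afterwards, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the records are text captions rather than images (real image pipelines would more likely use perceptual hashes), and the eight-word overlap threshold is an arbitrary choice, not a legal test for infringement.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def deduplicate(records: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each record; drop later exact duplicates
    (exact after normalization). Hypothetical helper, not a real pipeline."""
    seen: Set[str] = set()
    unique: List[str] = []
    for record in records:
        digest = hashlib.sha256(normalize(record).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


def word_ngrams(text: str, n: int) -> Set[Tuple[str, ...]]:
    """Return the set of runs of n consecutive words in the text."""
    words = normalize(text).split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlaps_training_data(output: str, training: Iterable[str], n: int = 8) -> bool:
    """Hypothetical output filter: flag an output that shares any run of n
    consecutive words with a training record. The threshold n is an assumption."""
    output_ngrams = word_ngrams(output, n)
    return any(output_ngrams & word_ngrams(record, n) for record in training)
```

A pipeline built this way would call `deduplicate` once on the corpus before training and `overlaps_training_data` on each generated output, withholding any output that is flagged. Neither step decides whether an output is infringing; they only reduce the chance of verbatim reproduction.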

SUITS FOR COPYRIGHT INFRINGEMENT

There are now two cases in the U.S. against Stability AI, the more compelling of which is Getty's. Getty claims that Stability[10] downloaded 12 million images from Getty Images, along with their captions, in violation of its copyrights. It also alleges trademark infringement. Its complaint includes screenshots of various Stable Diffusion outputs that Getty contends infringe its derivative work rights. While Getty is open to licensing the use of photos from its database for training purposes, it takes issue with what it regards as Stability's egregious violation.

Copyright attorneys have long been arguing about who owns the copyright in computer-generated outputs. Stephen Thaler, who owns a computer system that runs generative AI software, submitted an image produced by his system to the Copyright Office and asked for a registration certificate. The Copyright Office rejected his application on the ground that the image had no human author. He then sued the Register of Copyrights, seeking an order compelling the Office to issue him a registration certificate. This lawsuit is still continuing in federal court in Washington, D.C.

In addition, Kris Kashtanova, who creates images using Midjourney[11], filed a copyright registration application with the Copyright Office. Kashtanova received a registration certificate, but the Office subsequently discovered that the images were AI-generated. The Office cancelled the original registration and replaced it with a narrower one: Kashtanova may claim copyright in the text and in the selection and arrangement of the images, but U.S. copyright law does not cover the images themselves.

AI-generated works, according to a policy statement released by the Copyright Office, lack human authorship, belong to the public domain, and may be freely copied. You cannot prevent someone else from using material that contains AI-generated text, images, or other content just because you have applied to register copyright in it. Once material is in the public domain, it stays there. To sum up, the United States is struggling with the question of who owns the copyright in computer-generated outputs and with the possibility of infringement lawsuits. It will be essential to address these concerns and to make sure that copyright rules are respected as the cases develop.

The Copyright Office refuses to register AI-generated works because they lack human authorship, not because the outputs are infringing derivative works, which is good news for Stability AI. Large language models themselves, on the other hand, might raise the same authorship issue, since they are the result of automated processing of training data rather than creative works.

There are still unresolved issues surrounding generative AI, such as the need to adopt regulations specifically tailored to this field. The European Union has finalized the AI Act[12], which will place significant obligations on AI developers and on businesses that use such systems, with obligations that vary in severity according to the risk involved in the deployment. The AI Act was drafted with certain kinds of AI systems in mind, including general-purpose AI and healthcare systems.

US LEGAL SYSTEM AND AI

Generative AI is another wrench in the works, and at the moment the United States has few rules aimed specifically at it. Although there is a Blueprint for an AI Bill of Rights and NIST has released a framework for managing the risks of AI systems, these are only collections of broadly applicable guidelines. The only body of law with the power to destroy generative AI systems is copyright law. If courts hold that ingestion is infringing, the whole enterprise may essentially be shut down, so copyright law may pose an existential threat to progress in this area.

There are worries about the use of generative AI in movies, since the Copyright Office will not protect anything created with AI. The reason the Office refuses to register AI-generated works is not that the outputs are infringing derivative works; rather, it is that the works lack human authorship[13]. In short, years will pass before courts are able to resolve the issues raised by generative AI definitively. Discussions of AI governance are taking place in major cities around the world, and it is important to follow this developing topic in order to develop arguments that generative AI will strengthen copyright rather than undermine it.

The use of AI across a variety of industries, and the Copyright Office's treatment of it, is a matter of debate. The fact that human creators decide to a considerable extent how these technologies are used raises questions about the ramifications for businesses like Disney. Disney is reluctant to allow third parties to use its computer-generated material outside of its films because it does not want to be seen as dangerous. Although the Copyright Office has been addressing this matter for years, it is not an expert in this field. Through some of its poorly reasoned decisions, the present Copyright Office has in effect been dictating American industrial strategy. This is troubling because the Office does not take into account what its position means for companies like Disney.

One of the difficulties in disentangling text and data prompts in AI-generated work is telling textual inputs apart from graphical ones. A related question is the "co-pilot" model, in which AI acts as your co-pilot throughout the development process, and what it means for the results of that collaboration. Given that Europe is also heading in this direction, this may sustain copyright in these areas.

CONCLUSION

Having looked at the web-crawling cases, let us consider how far copyright reaches web crawling. Is it possible to argue that some of the images at issue in the Stability AI case are private even though only public data is being indexed? The discussion shows how important it is to pay closer attention to how AI-generated material affects the Copyright Office, to understand the subtleties of AI-generated works, and to weigh the possible legal ramifications.

Copyright also affects how search engines operate. One proposal is to adopt new guidelines that shield users' material from unwanted access. Breaking through a paywall, which would weigh against fair use, is one possible problem. In Authors Guild v. Google, however, copies of 27,000 volumes from a private research library collection of millions of books were made. Google is superior to Bing because of its enormous volume of data, having scanned millions of books to form the Google Books corpus.

Whether the human authorship requirement that courts now apply is actually codified in statute should also be discussed. It seems logical that it would be, but things are not always like that. Despite being more than 300 pages long, the Copyright Act itself lacks precise guidance for some situations. The section setting out the exclusive rights is probably just twenty-five sentences long, and it is hardly a reliable guide to how future cases will come out.

Another question is whether using these tools for internal work, such as writing or producing copy for an advertisement, exposes one to copyright infringement. If the outputs infringe the derivative work right, the person who generated the outputs is as accountable for the infringement as the person who designed the program that produced them. This is because damages may be assessed regardless of whether the output is used as an advertisement or as art.

I conclude by emphasizing how crucial it is to understand the copyright system and how it affects consumers. It is critical to understand that copyright is a strict liability regime, so users can infringe without intending to, although using someone else's work for profit does not always put users at risk of infringement.

[1] Case No. 3:16-cv-00826-WHO (N.D. Cal. Dec. 4, 2017)

[2] Class action against GitHub Copilot [LWN.net]. (n.d.). https://lwn.net/Articles/914150/

[3] OpenAI and Microsoft face fresh lawsuit from US news organisation. (2024, June 28). Artificial Intelligence | World IP Review. https://www.worldipreview.com/artificial-intelligence/openai-and-microsoft-face-fresh-lawsuit-from-us-news-organisation

[4] Hillemann, D. (2023, January 23). AI-Related Lawsuits: How The Stable Diffusion Case Could Set a Legal Precedent. Fieldfisher. https://www.fieldfisher.com/en/insights/ai-related-lawsuits-how-the-stable-diffusion-case

[5] Module 3: The Scope of Copyright Law - Copyright for Librarians. (n.d.). https://cyber.harvard.edu/copyrightforlibrarians/Module_3:_The_Scope_of_Copyright_Law

[6] Field v. Google, Inc., 412 F. Supp. 2d 1106 (D. Nev. 2006)

[7] Authors Guild v. Google, Inc., No. 13-4829 (2d Cir. 2015). (2015, October 16). Justia Law. https://law.justia.com/cases/federal/appellate-courts/ca2/13-4829/13-4829-2015-10-16.html

[8] Survey Reveals 90 Percent of Writers Believe Authors Should Be Compensated for the Use of Their Books in Training Generative AI - The Authors Guild. (2023, May 15). The Authors Guild. https://authorsguild.org/news/ai-survey-90-percent-of-writers-believe-authors-should-be-compensated-for-ai-training-use/

[10] U.S. District Court for the District of Delaware. (2024, April 17). Getty Images (US), Inc. v. Stability AI, Inc. Justia Dockets & Filings. https://dockets.justia.com/docket/delaware/dedce/1:2023cv00135/81407

[11] Lawler, R. (2023, February 23). The US Copyright Office says you can’t copyright Midjourney AI-generated images. The Verge. https://www.theverge.com/2023/2/22/23611278/midjourney-ai-copyright-office-kristina-kashtanova

[12] High-level summary of the AI Act | EU Artificial Intelligence Act. (n.d.). https://artificialintelligenceact.eu/high-level-summary/

[13] Wang, Runhua (2024) "The Copyright Requirement of Human Authorship for Works Containing Artificial Intelligence-Generated Content," IP Theory: Vol. 13: Iss. 2, Article 2. https://www.repository.law.indiana.edu/ipt/vol13/iss2/2

Sunday, June 30, 2024