AI AND COPYRIGHT INFRINGEMENT
-Rishiraj Chandan
INTRODUCTION
The robots are coming for our jobs. The AI took our jobs. It is going to put all
of the artists out of work.
Well, maybe, maybe not, but artificial intelligence already exists and is raising
a wide range of moral and legal issues. Should AI be granted copyright for the
work it creates? What exactly would that mean? And if the AI does not hold the
copyright, who does? What about training an AI on someone else's work so that it
creates new works of art that perfectly imitate the style of a real human artist?
We're going to look at all of that today, and we'll start by asking whether an AI
can own a copyright in something. Under existing law, an AI cannot hold any type
of copyright in the work that it creates. Why? Because only human authors are
granted copyright. The only thing eligible for copyright under the existing
Copyright Act is an original work of authorship, and courts have construed
authorship to imply a human author. The so-called monkey selfie case, Naruto v.
Slater[1], in which a macaque got hold of a photographer's camera, pressed the
shutter, and snapped a photo of itself, is possibly the best-known case involving
this particular problem. The group PETA filed a lawsuit on the monkey's behalf,
arguing that the monkey, and not the photographer, was the true author of the
image and as such had legal rights to it. The courts ruled against the monkey,
holding that a non-human may not claim a copyright or be considered the author
for the purposes of the Copyright Act.
If you are asking yourself whether a computer can be considered a human person,
the answer is no. Computers are not people, at least not for the time being. An
additional issue is that, to receive a copyright, a work has to be at least
minimally creative and expressive. There is a threshold for the minimum amount of
creativity that needs to go into a work for it to be copyrightable. This question
goes all the way back to a 19th-century case, Burrow-Giles Lithographic Co. v.
Sarony[2], which asked whether a photograph could be copyrighted.
Discussion:
Today, we take it for granted that all photographs, whether captured on film or
digitally, are copyrightable forms of expression. However, this was a contentious
issue back then, with some arguing that a photograph merely captures how
something actually appears in reality. The US Supreme Court held, and this has
now been codified into the Copyright Act, that a photograph taken by a
photographer has the minimal creativity required for copyrightability, whether it
lies in the framing, the settings used on the camera, the posing of subjects, or
the selection of the subject itself, all of which reach some minimal amount of
creativity. Whether you're talking about a painting, a film, or a piece of text,
the chances are that the courts will rule that wholly AI-generated work does not
meet even this minimal threshold, since a machine, not a real person, created it.
In a number of cases, people have created art using AI, tried to register that
work with the Copyright Office, and declared to the Copyright Office that the
work was wholly produced by AI. Because there was no human authorship or
expressive creativity in such instances, the Copyright Office refused to register
those particular works[3]. One particularly intriguing case involved the use of
an AI to create an entire comic book from a variety of prompts. The creator did,
in fact, receive a copyright registration for it. The Copyright Office is now
attempting to withdraw the registration for that specific comic book after
realising that the work was AI produced; we'll see how it goes.
This isn't a binary question, though. There is a spectrum here. At one end is
something produced entirely by AI. At the opposite end is something that is
wholly human-generated. In between, there are a variety of ways that individuals
might use AI as a tool. And for now, it's impossible to predict at what point a
court will rule that there has been enough human creative input to justify
copyright. For instance, I can imagine using the Content-Aware Fill feature in
Photoshop to have the computer replace a piece of a photo that I have deleted
while leaving the majority of the image intact. I expect that I would still be
able to obtain a copyright in that image. The open question is where the line
falls: at what point am I no longer really providing any input, so that the AI
has taken over and a court says, "No, you can't get copyright in that"? There
will be several court cases on this in the future; we'll see how the judges rule.
AI AND DATA:
However, a lot of people are also upset about how AIs are trained. We've seen
several cases where, after training on a dataset of a specific author's or
painter's work, the AI does a pretty good job of replicating that artist's style
and producing new works in it. In general, you cannot get a copyright in a style,
although you can in specific works of art. But is it fair, right, or lawful to
train an AI on a dataset that contains works protected by intellectual property
law? It appears that most AIs are almost worthless unless they have been trained
on a very big dataset, which typically entails collecting huge amounts of
copyrighted content from the internet. And if something appears to violate
someone else's copyright, it probably does.
Copyright infringement often occurs when someone reproduces another person's
work, taking it from one part of the internet and incorporating it into a
dataset. However, even where something looks at first like copyright
infringement, you might have a fair use defence[4], so the inquiry does not end
there. And as it happens, there are a few cases in which vast datasets were
created by scraping the internet, and these might serve as the foundation for a
fair use defence for researchers who build massive datasets before training an AI
on them.
The first of the two best-known Google-related cases, Perfect 10 v. Google[5],
concerned Google's search engine scraping essentially the whole internet,
including photographs, and then showing thumbnails, smaller reproductions of the
original images, in its search results. Perfect 10 was a website that generated
revenue by charging users for access to images. It did not appreciate Google
scraping its website and presenting thumbnails of its photographs, and it sued
Google for copyright infringement. The courts did not agree at every stage. But
in the end, they determined that even though Google was reproducing copyrighted
images, the use was fair because the thumbnails served a different function from
the original photos. In other words, the search feature and the tiny thumbnails
did not serve the purpose for which people used the original photographs.
The second case, Authors Guild v. Google[6], is known as the Google Books case.
Once again, Google was creating a sizeable database, this time of virtually every
book ever published, and letting users search through it. The Authors Guild was
not pleased that Google could search through all of these books. Again, Google
wasn't showing the complete book, only short excerpts where the search term
appeared, and Google claimed that enabling users to search through all of these
books was a fair use. And again, the court sided with Google: while the initial
creation of the dataset and reproduction of the books might otherwise qualify as
copyright infringement, the court said that Google's search and display were
permitted because they qualified as fair use.
However, that comparison may fail, since what Google did in both cases was mostly
descriptive. In essence, it merely described what was already there. AI is
different. It is generative. It may be in trouble because it creates something
new based on earlier works. As for AI actually producing work based on a dataset,
we don't really know what a court would conclude. The fair use defence available
to Google in a descriptive context might not apply in a generative AI setting.
Simply put, we don't know. Both sides have valid points to make. One may argue
that an AI's actions are comparable to those of Google's search engine. Another
argument is that the AI is doing what every single person in the world does:
taking inspiration from the art around them and combining it with their own
abilities to produce a fresh piece of art. It is also entirely plausible that
this will be resolved by statute, with a carve-out in the law saying that
although people perform the same process, it is not acceptable, and there is no
fair use defence, when an AI algorithm does it.
Analysis:
These rules are always subject to change. And that brings us to maybe the most
peculiar situation of all. As we mentioned, an AI cannot own the rights to a work
that it creates. Those who programme it probably do not get a copyright in the
works the AI produces either. However, this does not settle the issue of
copyright infringement, because you can infringe someone else's copyright even if
you hold no copyright in the thing you created. For instance, if I instructed an
AI text-to-image generator to create a cartoon of a mouse wearing red shorts, it
might produce a picture that is uncannily similar to Mickey Mouse, even though I
didn't ask for something that looked like Mickey Mouse. Because it looks just
like Mickey Mouse, the image still likely infringes Disney's copyright, even
though neither the AI generator nor I likely hold any copyright in it. Who is
responsible for the infringement in that instance? It is anyone's guess.
Copyright infringement is generally a strict liability offence. The coder could
theoretically be held liable. So, theoretically, could the person who gave the
prompt. I would undoubtedly be infringing if I went ahead and copied that image
and published it. But there are all kinds of potential unintended consequences
here that we don't necessarily have a good answer for. And speaking of unintended
consequences, right on schedule, the lawsuits are starting to pour in. There are
already several interesting lawsuits testing these copyright issues.
Stability AI, which produced the generative tool called Stable Diffusion, is the
target of the first round of lawsuits. In this technique, the programme is first
trained to reconstruct the images it has been fed. Then, in response to a text
prompt, it produces new images. Several businesses, notably DeviantArt and
Midjourney, now use Stable Diffusion. Getty Images has formally notified
Stability AI of its intention to sue the company in the UK for unlawfully
downloading millions of photos from its website, which may fall outside the fair
dealing exceptions in UK law. According to Getty, Stability AI copied and
analysed millions of Getty images to train its programme. In contrast to most AI
startups, Stability AI is open about where it obtains its data. Andy Baio and
Simon Willison were able to examine the training dataset using open-source tools.
They discovered that many of the pictures came from stock photo websites, such as
123RF, Shutterstock, VectorStock, and Getty, among others. On occasion, the AI
even mimicked the Getty watermark. When the Getty claim was filed at the High
Court of Justice in London, the company did not say whether it would also bring a
case in the United States. However, it said in a press release that Stability's
activities did not qualify as fair use under US law. Getty contends that it is in
a comparable position to the musicians whose music was distributed without
permission via services like Napster. Getty asserts that it is not attempting to
shut the business down but rather to establish a licensing arrangement similar to
the one Spotify has with its rights holders.

In the meantime, Stability AI, Midjourney, and DeviantArt were the targets of a
class action lawsuit brought by visual artists. The three named plaintiffs are
artists whose works were used to train AI software. Sarah Andersen writes and
illustrates the webcomic Sarah's Scribbles, Kelly McKernan is a game, book, and
comic book illustrator, and Karla Ortiz is a concept artist and illustrator who
has worked for Marvel Film Studios and Wizards of the Coast. According to the
complaint, filed in California, the three businesses violated copyright by using
the artists' works to train their image generators and to create derivative
works. The plaintiffs also make claims under unfair competition statutes. Only
the copyright owner has the authority to create or authorise an adaptation of the
original. According to the complaint, the image generators are just modern-day
collage tools that let users create unauthorised derivative works. But describing
AI-generated art as a collage seems oversimplified, and who knows what
comparisons the court will use to analyse and understand this novel technology
and its ramifications.
There are strong arguments on both sides of the debate as to whether the
generated images would be considered derivative works, but the plaintiffs will
likely encounter some of the problems that we have covered in this article. A
copyrighted image is protected, but an artist's style is not. In most cases,
decisions about copyright infringement are made image by image. The complaint in
Andersen v. Stability AI[7] asserts that "Stable Diffusion uses the training
images to produce seemingly new images through a mathematical software process
when used to produce images from prompts by its user." According to the
complaint, these images are derivative works of the specific images that Stable
Diffusion draws on to assemble a given output: "New images are based totally on
the training images. In the end, it is just a sophisticated collage tool." This
part of the complaint truly gets to the heart of the matter. The plaintiffs
contend that every piece of work produced by the AI is a derivative work. A
derivative work is a work, potentially copyrightable in its own right, that is
based on an earlier work.
Compare, for instance, "Jurassic Park" the movie to "Jurassic Park" the novel.
The original novel was protected by copyright. Michael Crichton controlled the
book and the right of anyone who wished to adapt it into a motion picture.
Because the film was a derivative work, Steven Spielberg had to obtain a licence
from Crichton in order to make that movie. The statutory definition of a
derivative work is "a work based upon one or more preexisting works, such as a
translation, musical arrangement, dramatization, fictionalization, motion picture
version, sound recording, art reproduction, abridgment, condensation, or any
other form in which a work may be recast, transformed, or adapted." Note that,
under the Copyright Act, a derivative work is one in which the original has been
transformed.
Conclusion:
Now, if you know anything about fair use, "transformed" and "transformative"
sound like practically the same criterion. This can get extremely complex. There
is no doubt that the movie "Jurassic Park" transformed the original novel.
However, this does not automatically mean that it is a fair use. For the
affirmative defence of fair use to apply, the use must generally be
transformative and serve a different, permitted purpose. It must also satisfy the
other fair use factors. And things keep getting trickier. As one commentator has
put it, the phrase "derivative work" in its technical sense does not apply to
every work that draws in any way on previously published works. A work is not
considered derivative unless it has substantially copied from another work. A
work may have been influenced in part by earlier works, but it is not a
derivative work if what is borrowed consists only of ideas and not their
expression. And as one court noted, "In truth, there are and can be few if any
things which in abstract sense are strictly new and original throughout in
literature, in science, and in art." The court went on to say that every book in
literature, science, and the arts must and does borrow extensively from what was
already known. If no book could be the subject of copyright unless it was new and
original in the elements of which it is composed, there could be no ground for
any copyright in modern times, and we would be obliged to ascend very high, even
in antiquity, to find a work entitled to such eminence.
Virgil took inspiration from Homer, Bacon relied on both ancient and modern
writers, Coke exhausted all of the knowledge of his profession, and even
Shakespeare and Milton would be found to have learned a lot from the rich
resources of current knowledge and classical studies available to them in their
day. It's difficult to imagine, but back then, people really went to the theatre
on purpose and saw things called plays. The problem with this copyright
litigation, as with many others, is that not all works of art are new, original,
or legally eligible for protection. It varies. Furthermore, the Stability AI
complaint continues, "Up until now, a buyer seeking a new image in a particular
artist's style has had to pay to commission or licence an original image from
that artist." That is simply incorrect, technically speaking. Artists' styles are
not protected. You can go to a second artist and ask them to create an artwork in
the first artist's style but with an entirely different subject. And if you did,
it wouldn't be considered copyright infringement. With very few exceptions, you
don't have to pay the original artist in order to have something created in that
artist's style.
So, for instance, even if a painting by Vincent van Gogh were still protected by
copyright, you would not be required to pay a fee to Van Gogh if you approached
another artist and requested a painting of a robot on a spaceship in the manner
of Starry Night. That kind of style is not protected. And that's one of
copyright's counterintuitive features. You may even set out to copy something,
but if the finished product differs significantly from the original, it is not a
violation of copyright. Suppose that I'm a painter and I'm attempting to recreate
the Mona Lisa. I try to imitate the Mona Lisa with it right there in front of me,
but I'm such a bad painter that my version basically turns out as a smiley face.
Even though I attempted to copy another work and essentially failed, it is not
infringement, even supposing the Mona Lisa were still protected by copyright,
because my painting bears little resemblance to the original. The reason is that
the test for copyright infringement is typically described as follows.
First, the plaintiff must prove actual copying. Second, the plaintiff must prove
that the copied material is substantially similar to the original, that is, that
there was improper or unlawful appropriation. What exactly constitutes unlawful
appropriation is up for debate, but the general consensus is that what was copied
must have been protectable expression from the earlier work, and the amount
copied must have exceeded the de minimis level, the trivial amount that the law
disregards. It becomes quite challenging. But I think you can now clearly
identify many of the real problems at hand. Were the works in the datasets used
to train the AI actually copied?
Yes, and that constitutes copyright infringement unless a fair use defence
applies, as it did in the Perfect 10 and Authors Guild cases. Then there is the
further question of whether art created by an AI trained on these images is
inherently a derivative work and whether it infringes the copyright in the
original works. That, too, is up for debate. Perhaps, but most likely not. This
is because copyright decisions are almost always made on an individual basis. You
must examine one image and compare it with the original. Only then can you decide
whether there has been copyright infringement. In general, you can't look only at
the method, rather than the result, and conclude that anything produced by that
technique is necessarily an infringement. Even if the defendants in this case
concede that there was copying, the issue is not resolved. That is only the start
of the conversation. The debate is here to stay, and for a long time.
[1] https://www.lexisnexis.com/community/casebrief/p/casebrief-naruto-v-slater
[2] https://www.lexisnexis.com/community/casebrief/p/casebrief-burrow-giles-lithographic-co-v-sarony
[3] https://www.smithsonianmag.com/smart-news/us-copyright-office-rules-ai-art-cant-be-copyrighted-180979808/
[4] What is fair use? Available at https://fairuse.stanford.edu/overview/fair-use/what-is-fair-use/
[5] 508 F.3d 1146 (9th Cir. 2007)
[6] 804 F.3d 202 (2d Cir. 2015)
[7] Docket No. 3:23-cv-00201 (N.D. Cal. Jan 13, 2023), Court Docket