ChatGPT & Co. – Artificial Intelligence and Copyright
Few topics are currently dominating the public debate as much as the impact of artificial intelligence on our workplace. This also applies to copyright law and the creative industries that depend on it. The following article provides a brief overview of the most important copyright issues arising from the use of AI. All issues can be traced back to the same question: “Who benefits from technological advances?”
Conflicts of Interest
In recent years, AI systems of all kinds have rapidly grown in importance. So-called generative AI systems, such as ChatGPT and Google Gemini for text or Midjourney for images, have attracted particular attention. These are systems that are able to autonomously generate media content based on human input (so-called “prompts”). Their products are not only deceptively similar to works of human creativity but can also replace them.
The generation of new content takes place in 4 steps:
1. Collection of training data in a database (“corpus”).
2. Analysis of the training data for patterns and correlations, followed by “fine tuning”.
3. User command (“prompt”).
4. Autonomous generation of new content.
The diversity of interests is already evident here. The rightsholders of the pre-existing works used for training (“training data”) want to be remunerated for their use or to be able to prohibit this form of use. Operators of AI systems, on the other hand, depend on the training data, as content cannot be generated without it. They stress that overly strict regulatory requirements will make a key technology of the future unattractive, and they argue that AI models do not replace existing works but only create new content. Finally, users of the systems in question want to be able to use the works generated on the basis of their specifications as extensively and exclusively as possible.
When a weeks-long strike by screenwriters and actors paralyzed the Hollywood film industry in the summer of 2023, the filmmakers made one demand in particular: that their work not be largely replaced by artificial intelligence (AI) in the future. The concern is that specially trained AI models will eventually generate not only scripts and individual scenes, but entire movies.
Is AI-generated Content Protected by Copyright?
Long before ChatGPT, the question of whether AI-generated content – text, photos, software, music – is protected by copyright, and if so, who is entitled to that protection, was debated. This question can be considered settled: according to the creator principle enshrined in almost all copyright laws, copyright protects only the results of human intellectual creation. Machine-generated works, which are created completely autonomously without human intervention, do not enjoy copyright protection. According to this understanding, the work of an artificial intelligence cannot be protected because it is the machine that creates, not the human. Only if the user gives the AI such specific instructions that the design of the work has already been determined, and the AI only has to implement that design, would copyright protection be justified. These requirements are generally not met by today’s text and image generators. They create content autonomously and are ultimately unpredictable.
A common objection to the lack of copyright protection for AI products is that users often invest considerable effort and creativity in designing a prompt so that the AI model generates the desired content. For example, in a case before the US Copyright Office, the author of the 16-page comic “Zarya of the Dawn”, whose images were generated by the image AI Midjourney, argued that it had taken her almost a year to develop appropriate prompts and procedures to obtain the desired images. In a very readable decision of 21 February 2023, the US Copyright Office rejected this argument, stating that no amount of preparation could change the fact that the images were created autonomously by the AI.
However, content created by AI can be protected by copyright if it is subsequently edited and developed further by humans. This requires that the adaptation itself is a personal intellectual creation and is not limited to editorial and technical changes; strict requirements apply. Creators who depend on the exclusivity of their content will nevertheless seek to subject pure AI works to their own creative adaptation in order to obtain copyright protection.
Protectability of the Prompt
From the point of view of the user of an AI system, the most important input is the prompt, i.e. the instructions to the AI. Good prompts are the basis for good AI output, and a new profession, the prompt engineer (“prompter” for short), has emerged to formulate these instructions. This raises the question of whether prompts are protected as literary works. The answer is yes, provided that the prompt itself meets the requirement of personal intellectual creation. The content generated by the AI model using the prompt is not taken into account in the assessment.
Does the AI Product Infringe Third Party Rights?
The fact that the AI product itself is not protected by copyright does not mean that its distribution cannot infringe the rights of third parties, e.g. the authors of pre-existing works. An important motive for using AI models is to create new content by referencing or using pre-existing works (examples: “Write me a new story with Pippi Longstocking” / “Create a picture in the style of Keith Haring”). Generative AI can identify the typical elements of an artist’s work and imitate the recognized style to create a new image that looks like a work by that artist. Similarly, sequels to novels or films and new stories about existing characters can be created.
Whether this content infringes the rights of the authors concerned will always depend on the individual case, but the assessment is based on traditional copyright principles: an artist’s style alone is not copyrightable; only the specific design of a work is. The underlying style remains in the public domain, so that the artist cannot monopolize his/her chosen cultural technique through copyright and thereby restrict the artistic freedom of others. The assessment is different if the AI product refers to specific works and these remain recognizable, i.e. their essential elements do not fade away in the new content. An example of this is the well-known AI work “Vincent Creating the Starry Night on His Laptop” by AI artist David R. Smith. The reference to van Gogh’s Starry Night is obvious: the original work does not fade in the AI image. If the works of van Gogh, who died in 1890, had not long been in the public domain, the AI image would infringe the rights of van Gogh or his legal successors as an unauthorized adaptation (Section 23 of the German Copyright Act, UrhG).
If the AI output is a (recognizable) reproduction or adaptation that infringes the copyright of a third party under these standards, the user must obtain a license from the third party for publication or exploitation or be able to rely on a statutory restriction such as pastiche under Section 51a UrhG.
Use of Existing Works as Training Data
Both small and large AI models are based on training with existing works. In the case of large AI models such as ChatGPT, this is assumed to be almost all the content available on the World Wide Web. The AI then independently analyses the entire volume of text for patterns and correlations to determine the rules on which the language is based (“pre-training”).
To put it bluntly: generative AI is only as good as the existing works on which it has been trained.
As a result, authors and rights holders are demanding a ban on the use of their works to train AI models, or appropriate compensation. Numerous lawsuits brought by prominent authors, publishers and image database operators, who believe their rights have been infringed by well-known AI models, are already pending in the US. The outcome of these cases remains to be seen. In any case, the treatment of pre-existing works by AI models is the key issue.
In Europe, however, the EU legislator has given priority to the operators of AI models. The Digital Single Market (DSM) Directive of 2019 introduced a copyright limitation for text and data mining, and thus for machine learning, which was implemented in Germany through Section 44b UrhG. Section 44b UrhG permits the reproduction of works that is necessary for the automated analysis of these works, unless the rights holder has expressly reserved this type of use (“opt out”). However, due to technical limitations, the possibility of opting out currently plays little role. This means that there is a statutory authorization to collect works on a large scale and to create a training corpus from them. Unlike under the old legal situation in Section 60d UrhG, this authorization is not restricted to research institutions and therefore also covers commercial purposes. The aim of this regulation is to promote technical innovation in the EU and to provide companies with a secure legal framework for AI applications.
However, rights holders are opposed to this interpretation. They point out that the application of any limitation rule must meet the requirements of the three-step test laid down in several international copyright treaties, according to which the application of the limitation must neither impair the normal exploitation of the work nor unreasonably prejudice the legitimate interests of the author. If an AI makes it possible to paint pictures in the style of an artist or to create serialized stories, the application of the exception undermines the basis of exploitation and the livelihood of authors.
The outcome of this discussion is open – the pressure on legislators worldwide to create a (new) legal framework for the use of pre-existing works is growing.
Conclusion
Many of the traditional principles of copyright law can easily be applied to the models of generative artificial intelligence. However, they do not solve the core socio-political problem: how to deal with the fact that creative content – text, images, graphics and software – which has traditionally been the domain of human creativity and whose exclusive copyright protection has formed the basis of an entire industry, can be generated by AI at the touch of a button? Creative people see their livelihoods threatened. And even if AI products were given some form of protection that could form the basis for exclusive rights and legal exploitation, this would not change the low value of these products and their reduced licensability.