
Navigating Data Ownership and Copyright in the Age of Artificial Intelligence: Legal Challenges and Implications


Author: Yadla Mahita Shri, National Academy of Legal Studies and Research (NALSAR)


ABSTRACT

The rise of artificial intelligence (AI) in society has brought many legal challenges with it. This is especially visible in the field of data ownership, where the growing use of AI raises the question of whether a given work was produced by the AI or by the person using it. AI is a tool that analyses a large collection of training data and uses it to generate new data for a given task. The use of this powerful tool gives rise to two legal issues: whether it is legal to use data without permission to train AI systems, and whether the data generated by these systems can be owned. This paper aims to address these questions and explore the legal implications surrounding data ownership in AI. Addressing them is essential, as the answers will seriously affect parties such as the industries using AI and the owners of the data used to train these systems.

KEYWORDS

Artificial intelligence, data ownership, copyright, fair use, intellectual property, AI-generated content, Indian copyright law, originality, creativity, human involvement, technological innovation.

INTRODUCTION

Artificial intelligence here refers to technology that allows computers and machines to perform certain tasks that humans can do. To perform these tasks, an AI system must be trained on existing data from various sources, which it draws on to generate the required result. Examples of AI systems include ChatGPT and Midjourney, where the user enters a prompt describing what they are looking for and the system produces a result based on the data it was trained on. However, these processes of training and result generation raise several complex legal issues. Primary among these is the question of data ownership: who owns the data used to train AI systems, and can the resulting outputs be considered the intellectual property of either the AI or the human behind it? Additionally, concerns about copyright infringement arise when copyrighted materials are used for training without explicit permission. The situation becomes even more complex given the transformative nature of AI-generated outputs, which, while unique, often bear traces of the original data sources.

These legal questions challenge traditional intellectual property frameworks and expose significant ambiguities in copyright law. This paper will explore the implications of AI for data ownership and copyright law, addressing the legal complexities that arise as AI continues to integrate itself into various sectors. Ultimately, understanding these issues is crucial, as the answers will affect not only the industries adopting AI but also the rights of original content creators and the ethical limits of AI use in a data-driven world.

Permission and Fair Use in Training AI with Copyrighted Data

When AI systems are trained, they take in all kinds of data, including copyrighted data. This raises the question of whether using such data without permission infringes the copyright holder's rights. Arguments are made on both sides. Those who defend AI argue that although the system trains on existing works, what it generates differs from any original piece so significantly that it can no longer be called a copy. Another argument made against artists claiming copyright is that an art style cannot be copyrighted. While this may be true, artists respond that they can still claim copyright in the individual artworks used in training. To weigh these arguments, we must first look at what counts as fair use of copyrighted material.

Fair use of copyrighted material means that using the material in a given context does not infringe the rights of its owner. Whether a use qualifies as fair depends on several factors: the purpose of the use, the amount of the work used, the nature of the material and the effect of the use on the market. Depending on how these factors apply, the use of copyrighted data may or may not be considered fair use. Fair use is an important defence and was decisive in Perfect 10 v. Google and Authors Guild v. Google: in both cases Google relied on fair use of the copyrighted material, and the courts ruled in its favour. Training AI is often argued to be fair use on the ground that AI models transform the data into new works instead of directly copying it. A recent example arose when the social media platform X updated its terms of service to allow content posted there to be used to train AI. The change caused unrest among users but was not considered illegal.

When a groundbreaking invention like AI emerges, the challenge is to protect people's rights without impeding the progress of innovation. In the words of Shashi Tharoor, Member of Parliament for Thiruvananthapuram and former chairman of the Parliamentary Standing Committee on IT:

“On the one hand, authors would go broke if their copyrights weren’t respected and, on the other, innovation would be impossible if vast amounts of copyrighted work were not allowed to be used to train new AI systems.”

This need to leave room for innovation could be another reason for allowing AI systems to use copyrighted data rather than restricting them.


Ownership of AI Generated Content

Before turning to the main question of ownership, a preliminary question must be asked: can data be owned at all? The answer depends on whether the law treats that type of data as intellectual property. Categories of data such as works of art, patentable inventions or software are recognised as intellectual property and can be owned, while anything outside such categories cannot. Thus, not all data can be owned, only certain recognised types.

The next question is who can own the data and claim copyright over it, or whether anyone can claim copyright over AI-generated data at all. For content to be eligible for copyright, it must generally satisfy two conditions. Firstly, it must have a human author, so copyright can only be claimed by a human. This means that a non-human entity such as an AI cannot own the content or hold copyright in it. This was an important factor in Naruto v. Slater, where a monkey that had taken a selfie could not claim ownership of the picture because it was not human.

Secondly, to claim copyright over content, the person must have put a certain amount of creativity into it. This issue was addressed in Burrow-Giles Lithographic Co. v. Sarony, which considered it in the context of photography. The debate was whether taking a photograph met the creativity threshold required for a copyrightable work. The court concluded that the choices that go into taking a picture, such as the lighting, the subject and the positioning, were sufficient creative input from the photographer. Can the same be said for AI-generated content? Although creativity may be involved, most of it comes from the AI and not the user. For this reason, AI-generated works tend not to qualify for copyright. This was seen with the comic “Zarya of the Dawn”, whose artwork was generated with the AI tool Midjourney. Although the images had enough creativity to cross the required threshold, that creativity came mostly from the AI and not from the author. For that reason, the US Copyright Office cancelled the registration it had initially granted and replaced it with one that excludes the AI-generated images.

Closely tied to this second condition is the question of human involvement. It is debated whether entering a prompt into an AI system amounts to sufficient human involvement. The comparison drawn is with a camera, which is a tool the photographer uses to create a work; on this view, AI plays a similar role as a tool. Despite this reasoning, using AI to create a work has generally not been considered sufficient human involvement. Thus, for these reasons, AI-generated content generally cannot be protected by copyright.

Indian Context

AI is becoming an important tool across India too, and the question of ownership arises here as well. Under Indian copyright law, the author of a work is generally considered the first owner of the copyright in it. However, the Copyright Act of 1957 does not appear to have anticipated AI: it does not address AI-generated works and does not recognise AI as capable of being an author. This raises doubts about the ownership and originality of content created by these systems.

As discussed above, a work must meet certain essential criteria to be copyrightable: it must be both creative and original. Section 13 of the Indian Copyright Act extends copyright protection to literary, dramatic, musical and artistic works only if they are original, but the Act does not define the word “original”, leaving the courts free to interpret it case by case. This flexibility is helpful because AI is still developing and constantly changing, so the law surrounding it cannot be rigid. Since AI-generated work is produced from training data and largely lacks the human input expected of creative works, there are many doubts about whether such works are eligible for copyright protection.

The Indian Copyright Act was amended in 1994 to cover computer-generated works, that is, creative works created using a computer program. For such works, the amendment treats the person who causes the work to be created as the author. “Person” here is assumed to mean an actual human being and not an AI system, so AI systems cannot be considered authors in the Indian context either. Thus, without evidence of human involvement, no copyright claim can be upheld.

Sahni's attempt to register a copyright claim is an intriguing Indian example concerning the ownership of AI-generated work. Sahni used an AI program to apply the art style of the famous painting The Starry Night to a photograph Sahni had taken, creating a new piece of art that combined the two. This led to uncertainty over who could claim copyright as the author of the work. An initial application naming a sole author was denied, but a later application listing Sahni and the AI (RAGHAV) as co-authors was accepted. This was the first time an AI was recognised as the co-author of a work it helped create, and copyright could be claimed on that basis. This compromise recognises the human’s creative input while still acknowledging the contribution made by the AI to the work.

The case above shows that the question of whether AI can be considered a person under Indian law is yet to be resolved. Whether AI should be treated as a tool or as something more still needs to be clarified. Since AI is still being developed, the laws governing it remain flexible, and the legislature will need to keep addressing the new issues it raises. Acknowledging AI-generated content as eligible for copyright could foster technological progress, but it may also disrupt conventional ideas of intellectual property by extending copyright protection to creations beyond human authorship. As India’s copyright law adapts, it will be crucial to develop clear standards that define AI’s involvement in creative work. This will help strike a balance between promoting innovation and protecting the rights of human creators.

LITERATURE REVIEW

Numerous articles, papers and videos address this topic, as it is highly relevant today. Whether the law favours AI or not would heavily affect how the world operates and would significantly affect the economy, so there is extensive discourse on the subject. The legal commentary largely shares a similar view and raises similar doubts. A few articles also address the situation in India, noting that the current laws are not fully adapted to present circumstances.


METHODOLOGY

This article relies on a secondary analysis of qualitative data, drawing on a range of existing sources including research papers, articles and other pertinent documents related to the topic. These resources form the essential data for this study. Studying the arguments made both for and against is important, as it helps to grasp the actual current situation while setting aside bias.


CONCLUSION

The increasing prevalence of artificial intelligence raises important questions about data ownership and copyright in the modern era. As AI systems become capable of creating content, the legal framework struggles to keep pace, especially regarding permission for training data and ownership of AI-generated works. The Indian Copyright Act emphasises human authorship and does not address the legal recognition of AI-generated works. While recent cases, like the Sahni case, have explored AI as a potential co-author, the law remains ambiguous on AI’s role as a creator and owner of the work.

This ambiguity underscores a critical need for legislative updates that address the dual objectives of protecting human authorship and enabling technological innovation. Balancing these interests will be essential to foster a legal environment that supports both human creativity and AI’s potential in content creation, ensuring clarity and fairness in intellectual property law as AI continues to evolve.


REFERENCES

1. Cole Stryker, Eda Kavlakoglu, What is AI?, IBM (Aug. 16, 2024), https://www.ibm.com/topics/artificial-intelligence.

2. Copyright and Fair Use, Harvard University, https://ogc.harvard.edu/pages/copyright-and-fair-use.

3. Alex Ivanovs, X/Twitter has updated its Terms of Service to let it use Posts for AI training, Stackdiary (Sep. 1, 2023), https://stackdiary.com/x-can-now-use-posts-for-ai-training-as-per-terms-of-service/.

4. Sophie Goossens, Jess H. Drabkin, Gerard M. Stegmaier, The thorny issue of data ownership, Reed Smith (Feb. 5, 2024), https://www.reedsmith.com/en/perspectives/ai-in-entertainment-and-media/2024/02/the-thorny-issue-of-data-ownership.

5. Rebekah Harrison-Smith, AI is...Monkey Business?, Student Bytes (Oct. 2024), https://bytes.scl.org/ai-is-monkey-business/#:~:text=AI%20and%20the%20Challenge%20of,is%20becoming%20order%20to%20apply.

6. Naruto v. Slater, No. 15-cv-4324 (N.D. Cal. filed 2015).

7. Ryan N. Phelan, The Curious Case of Burrow-Giles Lithographic (an 1884 U.S. Supreme Court decision involving “new” camera technology), and how it could help Shape Today’s Thinking on Artificial Intelligence (AI) Inventorship, Marshallip (Aug. 30, 2022), https://www.patentnext.com/2022/08/the-curious-case-of-burrow-giles-lithographic-an-1884-u-s-supreme-court-decision-involving-new-camera-technology-and-how-it-could-help-shape-todays-thinking-on-artificial/.

8. Benj Edwards, AI-generated comic artwork loses US Copyright protection, Ars Technica (Feb. 23, 2023), https://arstechnica.com/information-technology/2023/02/us-copyright-office-withdraws-copyright-for-ai-generated-comic-artwork/.

9. Jess Sawyer, Who Owns AI-Generated Content, Originality.ai (Aug. 8, 2024), https://originality.ai/blog/ai-content-ownership.

10. Katherine Abraham, AI’s right to copy, India Business Law Journal (Sep. 19, 2024), https://law.asia/generative-ai-copyright-law/#:~:text=“The%20creator%20of%20the%20AI,or%20as%20a%20juristic%20person.

11. Sukanya Sarkar, Exclusive: India recognises AI as co-author of copyrighted artwork, Managing IP (Aug. 5, 2021), https://www.managingip.com/article/2a5czmpwixyj23wyqct1c/exclusive-india-recognises-ai-as-co-author-of-copyrighted-artwork.


Jan 3
