UK begins review of training AI models on copyrighted content

On December 9, OpenAI released its artificial intelligence video generation model Sora to the public in the US and other countries.

Cfoto | Future Publishing | Getty Images

The UK is preparing measures to regulate the use of copyrighted content by tech companies to train their AI models.

The British government on Tuesday launched a consultation aimed at giving both the creative industries and AI developers more clarity about how intellectual property is obtained and then used by AI firms for training purposes.

Some artists and publishers are unhappy that their content is being freely scraped by companies like OpenAI and Google to train large language models – AI models trained on large amounts of data to generate human-like responses.

Large language models are the core technology behind today’s generative AI systems, including models such as OpenAI’s ChatGPT, Google’s Gemini, and Anthropic’s Claude.

Last year, The New York Times filed a lawsuit against Microsoft and OpenAI, accusing the companies of copyright infringement and misuse of intellectual property to develop large language models.

In response, OpenAI disputed the NYT’s claims, saying that using open web data to train AI models should be considered “fair use” and that it provides an “opt-out” for rights holders “because it’s the right thing to do.”

Separately, image distribution platform Getty Images has sued Stability AI, another generative AI firm, in the UK, accusing it of scraping millions of images from its websites without consent to train its Stable Diffusion AI model. Stability AI disputed the suit, noting that the training and development of its model took place outside the UK.

Suggestions to consider

First, the consultation will consider an exception to copyright law for AI training carried out for commercial purposes, while still allowing rights holders to reserve their rights so they can control the use of their content.

Second, the consultation will put forward proposed measures to help creators license their content and be paid for its use by AI model makers, and to clarify what material AI developers can use to train their models.

The government said more needs to be done by both creative industries and technology firms to ensure that any standards and requirements for rights protection and transparency are effective, accessible and widely accepted.

The government is also considering proposals to require AI model makers to be more transparent about their model training datasets and how they are obtained, so that rights holders can understand when and how their content is used to train AI.

This could prove controversial: tech firms are not especially forthcoming about the data that powers their coveted algorithms or about how they train them, given the commercial sensitivities involved in revealing these secrets to potential competitors.

Earlier, under former prime minister Rishi Sunak, the government tried to agree a voluntary AI copyright code of practice.

AI copyright rules: UK vs US

In a recent interview with CNBC, the head of app development software firm Appian said he thinks the UK is well-placed to be “a global leader in this.”

Matt Calkins, CEO of Appian, told CNBC that “the UK has put a stake in the ground by declaring the priority of private intellectual property rights.” He cited the 2018 Data Protection Act as an example of how the UK is “closely involved with intellectual property rights”.

The UK is also “not subject to as much lobbying by domestic AI leaders as it is in the US,” Calkins added, meaning it may not be as inclined to bow to pressure from tech giants as politicians in the US are.

“Anybody writing AI legislation in the US is going to hear from Amazon, Oracle, Microsoft or Google by the time this bill hits the floor,” Calkins said.

“This is a powerful force that discourages anyone from writing smart laws or protecting the rights of individuals whose intellectual property is taken wholesale by these major AI players.”

As tech companies move toward a more “multimodal” form of AI, meaning AI systems that can understand and generate content in the form of images and video as well as text, the issue of potential copyright infringement by AI firms is gaining more attention.

Last week, OpenAI launched its AI video generation model, Sora, to the public in the US and “most countries internationally.” The tool lets users describe a desired scene in text and generate a high-definition video clip.
