With generative AI dominating the news in 2023, some people are beginning to question the ethical ramifications of tools such as Stability AI's Stable Diffusion and OpenAI's Dall-E.
On Jan. 17, Getty Images revealed that it had initiated legal proceedings in London against Stability AI for allegedly infringing on "intellectual property rights including copyright in content owned or represented by Getty Images."
Getty accused Stability AI of unlawfully copying and processing millions of copyright-protected images owned by Getty Images, along with their associated metadata.
In a statement to TechTarget Editorial, a spokesperson from Stability AI said: "Please know that we take these matters seriously. We are reviewing the documents and will respond accordingly."
The inclusion of metadata in the suit could mean that Getty believes Stability AI accessed Getty's images and the information associated with them without permission, Forrester Research analyst William McKeon-White said.
"If all of this is replicated, then there was really no additional effort done by the team in order to alter these [images]," he said. "Then it wasn't used as a generic training element -- it was used for the specific purpose of eventually commercializing after running it through the AI system."
If the court agrees with Getty that Stability AI accessed the images intentionally for the purpose of commercializing them, that finding would affect the argument about the "transformative nature of fair use," McKeon-White added. Fair use allows for the use of copyrighted material under certain conditions without obtaining permission from the owner.
In addition, it would mean that Stability AI intentionally copied material from Getty Images.
The artists who are suing generative AI vendors are also trying to deprive the vendors of the protective shield of fair use.
Artists Sarah Andersen, Kelly McKernan and Karla Ortiz filed a copyright infringement lawsuit against vendors Stability AI, Midjourney and DeviantArt in January in California.
Represented by the Joseph Saveri Law Firm, known for its legal actions against tech and gaming vendors, the artists claim that Stability AI acquired billions of images, including their artwork, without permission to create Stable Diffusion.
The artists claim that images derived from their original work compete with them, and that people who previously would have had to commission them could instead use Stable Diffusion to obtain the same kind of artwork.
"The developers [of these AI systems] believe that it is fair use," said McKernan, who is based in Nashville, Tenn. "They believe the resulting AI artwork is transformative enough that it isn't breaking copyright law. I don't believe that. It is very possible to accidentally create an image that already exists and therefore would be infringing on that artist."
Many artists have reported cases where their names were used as prompts to create artworks similar to theirs, she added.
"It was extremely disrespectful and shortsighted of [the developers] to believe they could just use all of that data for their purposes without any consent or compensation or credit for the artist," McKernan said.
The artists hope the courts can determine that the data set used by Stability AI is unethical since it includes copyrighted images and that developers would need the permission of living artists when using their artwork as part of a data set used to train generative AI tools.
The fair use argument is complicated, but the issue of privacy is more straightforward, said Andrew Burt, managing partner at law firm BNH.AI, a legal AI specialist.
Privacy in this sense refers to whether the developers of the AI tools were authorized to use the data in the way they did. In one example, a person found photos from their own healthcare records using Have I Been Trained, a tool that shows whether someone's image has been used in an AI data set.
"In the privacy world, these issues are pretty clear," Burt said. "If you are not authorized to use data for a specific purpose, you just can't use that data."
This is partly why the Federal Trade Commission (FTC) has begun to engage in the practice of algorithmic disgorgement, penalizing organizations that deceptively use data to build AI and machine learning models. Companies that did not gain permission to use the data would have no other option but to destroy that data under the FTC's rules.
While the FTC has some enforcement power over illegitimate use of AI, no one knows how the courts will respond, and there is little case law on AI.
Amid such uncertainty, the lawsuits against generative AI systems -- including the suit against Microsoft, OpenAI and their Copilot code-generating system -- will influence how developers of the systems move forward.
"There is a lot of uncertainty because if the courts end up saying that copyrighted material cannot be used in this manner to train large-scale models, it's going to put a huge dent in how these models are trained," Burt said. "On the other hand, if the courts say these models can be trained on basically any data, whether or not copyrighted, it's going to have significant implications for all of the people who own images the systems are trained on."
The most reasonable outcome is for developers of generative AI systems to be required to get permission before using any data.
"The part of this that is the clearest is you can't just do whatever you want with data as an organization," Burt said. "You need to understand what the limits are for that data, whether they're related to copyright or whether they're related to data privacy restrictions."
It might be in the best interest of those involved in the lawsuits -- vendors, artists and Getty Images -- to find a middle ground outside the courts or legislation, some say.
"Collectively, they have a much better sense of the dynamics of the markets in which they are all operating," said Michael Bennett, director of the education curriculum and business lead for responsible AI at Northeastern University's Institute for Experiential AI. "It's likely that they'll also agree that they'll all be better served if they develop an arrangement internally, rather than having one that's imposed by a court or imposed by legislation."
However, McKernan said she doesn't feel like the generative AI developers will concede so easily.
"I really don't see them giving that up without a big fight," she said. "If they made these databases entirely ethical, meaning no copyrighted information or nothing included without consent, their product isn't going to be as strong. It's as strong as it is because of the artists whose styles they're exploiting."