
Microsoft whistleblower, OpenAI, the NYT, and ethical AI

The vendor has filed a memorandum seeking dismissal of some of the claims The New York Times made in its copyright lawsuit. Meanwhile, it faces criticism from one of its own software engineers.



Microsoft faced both internal and external challenges this week. The tech giant on March 4 responded to The New York Times' lawsuit against it and partner OpenAI by filing a memorandum asking the court to dismiss parts of the complaint.

The next day, an AI engineer at Microsoft sent letters to Federal Trade Commission Chair Lina Khan and Microsoft's board of directors detailing his concerns about the safety of Copilot Designer, Microsoft's image-generating AI tool.

Copilot Designer and Shane Jones

Shane Jones, the engineer, has worked for Microsoft for six years and has been testing the image generator.

Microsoft, however, said it has dedicated red teams that work continually to identify, prioritize and address potential safety issues, and that Jones is not associated with any of them, according to a statement the company provided to TechTarget Editorial.

"We are committed to addressing any and all concerns employees have in accordance with our company policies and appreciate the employee's effort in studying and testing our latest technology to further enhance its safety," according to the statement attributed to a Microsoft spokesperson.

Jones said he warned Microsoft that Copilot Designer, powered by OpenAI's Dall-E image generator, produced images that did not align with the cloud provider's responsible AI principles.

The model generated images of demons and monsters, sexualized images of women in violent tableaus, and depictions of underage drinking and drug use, according to Jones.

He added that the problem was well known to both Microsoft and OpenAI, which disclosed it in its October 2023 Dall-E report card.

"I don't believe we need to wait for government regulation to ensure we are transparent with consumers about AI risks," Jones wrote in a letter to the Microsoft board. "Given our corporate values, we should voluntarily and transparently disclose known AI risks."

In response to Jones, the Microsoft spokesperson said the following in the statement:

When it comes to safety bypasses or concerns that could have a potential impact on our services or our partners, we have established in-product user feedback tools and robust internal reporting channels to properly investigate, prioritize and remediate any issues, which we recommended that the employee utilize so we could appropriately validate and test his concerns.

Microsoft previously facilitated meetings for Jones with product leadership and employees assigned to work on responsible AI to review his concerns. The vendor said it also encouraged Jones to create an anonymous ID, which he agreed to share with the company, to help the product team investigate his prompts and output activity for analysis and reproducibility.

Jones' letters come as Google, Microsoft's archrival in the generative AI race, also faces criticism of its Gemini multimodal model.

Google recently turned off the image-generating feature of Gemini after reports that the large language model (LLM) produced wildly inaccurate images of historical figures. For example, it generated images of the Pope as a Black man.

Concern about Jones' argument

However, what separates Google's predicament from Microsoft's is that the problems with Gemini were reported quickly and by multiple people, AI analyst Mark Beccue said.

Meanwhile, "Microsoft's stuff has been out there for a while, and nobody else is saying anything about seeing this issue," Beccue said. "It will be interesting in getting to see if other folks in the public have found this to be true."

With any new technology, it's common for hackers, and even the technology's own creators, to try to break it and see what could go wrong.

In the case of Google's problem with Gemini, which appears to stem from overcompensating for model bias, multiple users came forward saying they had encountered the same thing, Beccue said.

Moreover, the problem Jones has spotlighted reflects how difficult it is to ensure accuracy without bias in generative AI systems, he added.

"It's consistent with what seems to be kind of a flaw [with image-generating tools]," Beccue said. "These are tough to do, and that seems to be coming up again."

GenAI problem comes after ethics team axed

However, Jones' concerns are valid, said Tinglong Dai, a professor at Johns Hopkins University's Carey Business School.

"Microsoft can't escape the harmful consequences of the materials generated by its AI tools and will inevitably have to devote vast resources to moderating and stress testing their model and content," he said.

That Jones felt compelled to go outside of Microsoft is a natural result of the tech giant eliminating its entire ethics and society team, said Davi Ottenheimer, vice president of trust and digital ethics at Inrupt, a startup founded by Tim Berners-Lee, the inventor of the World Wide Web.

"They got rid of the people on the inside who would deal with this and make sure the product was safe before it got to our society," Ottenheimer said. "Now they have people coming outside because they no longer have the internal mechanisms to deal with these concerns."

The problems Jones describes are just the beginning, he said.

According to new research posted this month on arXiv, the preprint server operated by Cornell University, language models show covert racism in the form of dialect prejudice because the public data they are trained on is inherently racially biased.

"Language models now incorporate this prejudice, exhibiting covert stereotypes more negative than even human stereotypes about African Americans," Ottenheimer said. "Microsoft is in deep trouble because of the model they've adopted. And the ethicists who are pulling the whistleblower siren have to do so because they got rid of the people inside who would help them."

While Microsoft has not publicly responded to Jones' concerns and letters, it did respond to The New York Times' lawsuit accusing the tech giant and its partner, OpenAI, of copyright infringement.

Motion to dismiss

In a memorandum filed on Monday, Microsoft asked for parts of the complaint to be dismissed and compared the emergence of LLMs to the advent of the videocassette recorder (VCR).

The analogy refers to Hollywood's fear, when VCRs first went into wide use in the late 1970s, that home taping would devastate the movie business. Instead, the technology ultimately proved beneficial to the industry.

"Lawfully developed AI-powered tools should be allowed to advance responsibly just like valuable technologies of the past," the Microsoft spokesperson said.

The cloud provider argues that the Times is using "its might and its megaphone" to challenge the technology of LLMs.

Microsoft also claims the Times failed to show how Microsoft encouraged end users to make infringing use of OpenAI's GPT LLM technology, and that the Times' complaint does not show how Microsoft intended to facilitate or enable infringement.

In response to Microsoft's argument, Ian Crosby, lead counsel for the Times' case, said in an article in the Times that Microsoft doesn't dispute that it worked with OpenAI to copy millions of the media giant's works.

He wrote that the VCR comparison is odd because VCR makers never argued that massive copyright infringement was necessary to build their products.

Evidence of infringement

While Microsoft's argument might sidestep the question of whether it contributed to infringement, the cloud provider might not have to address that point yet because The New York Times did not cite a real-world instance of infringement, said Aaron Rubin, a partner in the Strategic Transactions and Licensing group at law firm Gunderson Dettmer.

He noted that the litigation has not yet reached the discovery phase.

"It's just basically The New York Times alleging a number of facts and stating its cause of action and then OpenAI and Microsoft saying, 'Okay, even if we were to take all of these facts as true, they still haven't adequately pleaded a case for copyright infringement,'" Rubin said.

But Microsoft's argument that the Times did not provide substantial evidence of end users being affected falls short, because copyright essentially protects society, Ottenheimer said.

"The New York Times is unique in the sense that they're so big and so experienced and so powerful, they have a voice that represents others who may not be able to speak for themselves," he said.

In other words, the Times is taking on a large business entity that smaller organizations could not afford to challenge, he said.

"If they can't protect their content as the largest, most influential voice on the internet who has content worth protecting, then everybody else loses their ability to protect themselves from what is essentially the bully in the room, which is Microsoft and OpenAI," Ottenheimer continued.

"Copyright law is not really just about protecting the content creator; it's actually about protecting society by encouraging creation," Ottenheimer added. "If creators don't have content controls, which is what OpenAI and Microsoft are arguing ... then they're undermining society."

The VCR argument

Moreover, Microsoft's comparison of VCRs to LLMs shows the AI vendor's lack of understanding, Ottenheimer said.

In using the analogy, the vendor is effectively claiming that users cannot extract the Times' content from OpenAI's models, even though the VCR was a technology with which movies could be copied.

"Microsoft is both saying, 'We're like the VCR,' and saying, 'We're not like the VCR,'" Ottenheimer said.

The argument also doesn't make much sense because it was Microsoft and OpenAI, not end users, who allegedly chose to use the Times' copyrighted materials to train their models, Dai said.

Microsoft, however, is trying to cast the lawsuit as rooted in fear of technological innovation in generative AI, Rubin said.

"They're saying The New York Times is trying to inhibit innovation here and that these two industries -- the AI tech world and the media -- can actually coexist and be symbiotic," he said.

However, the problems Microsoft and OpenAI now face were avoidable, Ottenheimer said.

The tech vendors could have built their technology using open and authorized data, he said.

"That is actually the way that everything should have been working and could have been built, and there are standards that do that," he added. "They're choosing not to use them."

Editor's note: This story was updated on March 8, 2024.

Esther Ajao is a TechTarget Editorial news writer and podcast host covering artificial intelligence software and systems.
