Definition

AI watermarking

By

Lev Craig

Published: Oct 31, 2023

What is AI watermarking?

AI watermarking is the process of embedding a recognizable, unique signal into the output of an artificial intelligence model, such as text or an image, to identify that content as AI generated. That signal, known as a watermark, can then be detected by algorithms designed to scan for it.

Ideally, an AI watermark should be invisible to the naked eye but extractable using specialized software or algorithms. A generative AI model that incorporates watermarking can be used like any other model, but its output will carry a detectable signal indicating that it was created using AI. Effective AI watermarking should also avoid impairing model performance; resist attempts at forgery, removal or modification; and be compatible with a range of model architectures.

AI watermarking is a relatively new technique that has seen increased interest in the wake of consumer-facing text and image generators, which have made it much easier to create believable content using AI. In March 2023, for instance, an image of the pope wearing a white puffer jacket was created using the image generator Midjourney and went viral on social media, where many users believed the image to be genuine.

Although that example is relatively benign, the ability to widely disseminate high-quality content produced by generative AI raises broader concerns about AI-manipulated media. For example, AI-generated images could be used to spread political misinformation and create deepfakes, while AI-generated text could help malicious actors conduct phishing campaigns and scams at a larger scale. As AI systems become capable of producing increasingly convincing output and AI-generated media becomes more prevalent online, researchers are exploring how to use hidden signals to indicate the origin of that content to audiences.

How AI watermarking works

The AI watermarking process involves two stages: watermark encoding, typically during model training, and watermark detection after output generation.

AI watermarks are often created during model training by teaching the model to embed a specific signal or identifier in generated content -- for example, a textual watermark hidden in a sentence generated by a large language model (LLM) or a visual watermark concealed in the output of an image generator. This process usually involves making subtle changes to the model during the training stage, such as alterations to model weights or features.

After model training and deployment, specialized algorithms detect the presence of the watermark embedded earlier, thereby checking whether a piece of media was generated by AI. For example, an algorithm might search for the presence of rare phrases or analyze an image's pixels to detect hidden patterns.

As an example, consider a watermarking technique proposed by Scott Aaronson, a computer scientist and researcher at OpenAI. An LLM such as OpenAI's GPT-4 generates output by predicting the next token -- a natural language processing term referring to a short unit of text, such as a word, syllable or punctuation mark -- based on the previous tokens. Each candidate for the next token is assigned a probability score indicating how likely it is to come next.
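The ordinary, unwatermarked sampling step described above can be sketched as follows. This is a minimal illustration, not any vendor's implementation: the candidate tokens and their raw scores are made up, and the `softmax` and `sample_next_token` helpers are hypothetical names introduced only for this example.

```python
import math
import random

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    exps = {tok: math.exp(s - highest) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def sample_next_token(probs, rng):
    """Randomly pick one token, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# Made-up raw scores for the token that follows "The cat sat on the".
raw_scores = {"mat": 3.2, "sofa": 2.1, "roof": 1.3, "moon": -0.5}
probs = softmax(raw_scores)

rng = random.Random(0)  # seeded only so the example is repeatable
next_token = sample_next_token(probs, rng)
print(probs)
print(next_token)
```

The most probable token is chosen most often, but not always. That leftover randomness is what a watermarking scheme can replace with a keyed choice.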

Normally, the model randomly selects the next token based on these probability scores. But to create an AI watermark, the model could instead make that selection using a cryptographic function whose secret key is accessible only to the model's developers. For example, the system might be more likely to choose certain rare words or sequences of tokens that a human would be unlikely to replicate.

The presence of these rare words and phrases would then function as a watermark. To an end user, the text output by the model would still appear randomly generated. However, someone with the cryptographic key could analyze the text to reveal the hidden watermark based on how often the encoded biases occur.
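One way to make the keyed selection and detection concrete is sketched below. It is a toy under stated assumptions, not OpenAI's implementation: the ten-word vocabulary, the uniform `toy_next_token_probs` stand-in for a real model, the use of HMAC-SHA256 as the keyed function and all function names are assumptions made for illustration. Rather than favoring rare words, this variant scores every candidate token with a keyed pseudorandom value r and picks the token that maximizes r ** (1 / p), where p is the model's probability for that token.

```python
import hashlib
import hmac
import math
import random

VOCAB = ["the", "a", "cat", "dog", "sat", "ran", "on", "under", "mat", "rug"]

def toy_next_token_probs(context):
    """Hypothetical stand-in for an LLM: every token is equally likely."""
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}

def keyed_score(key, context, token, window=4):
    """Map (key, recent tokens, candidate) to a pseudorandom value in (0, 1)."""
    msg = "\x1f".join(context[-window:] + [token]).encode("utf-8")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    n = int.from_bytes(digest[:8], "big") >> 12  # keep 52 bits
    return (n + 0.5) / 2**52                     # strictly between 0 and 1

def generate_watermarked(key, prompt, length):
    """Replace random sampling with a choice driven by the secret key."""
    tokens = list(prompt)
    for _ in range(length):
        probs = toy_next_token_probs(tokens)
        best = max(probs, key=lambda t: keyed_score(key, tokens, t) ** (1.0 / probs[t]))
        tokens.append(best)
    return tokens[len(prompt):]

def detection_score(key, prompt, tokens):
    """Average of -ln(1 - r); close to 1 for text made without the key."""
    context = list(prompt)
    total = 0.0
    for tok in tokens:
        total += -math.log(1.0 - keyed_score(key, context, tok))
        context.append(tok)
    return total / len(tokens)

key = b"developer-secret"  # hypothetical key held by the model developer
prompt = ["the", "cat"]
watermarked = generate_watermarked(key, prompt, 200)

rng = random.Random(1)
plain = [rng.choice(VOCAB) for _ in range(200)]  # text produced without the key

wm_score = detection_score(key, prompt, watermarked)
plain_score = detection_score(key, prompt, plain)
print(wm_score, plain_score)
```

Text produced without the key should score close to 1, while watermarked text scores noticeably higher. A detector holding the wrong key sees only ordinary-looking text, which is why the key has to stay secret.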

Similar techniques could theoretically be implemented to watermark images. For example, model developers could alter certain weights in early layers of convolutional neural networks to encode noise that functions as a watermark or include watermarked images in training data so that the model's output inherits those markers.
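The model-level approaches above are hard to show in a few lines, but the detection side (scanning pixels for a hidden pattern) can be illustrated with a much simpler post hoc scheme. The sketch below is a swapped-in technique, not the weight-level method described in this article: it adds a faint key-derived pattern of +1/-1 values to a grayscale image and later checks for that pattern by correlation. The image data is random filler, and every name and parameter is an assumption made for illustration.

```python
import random

def key_pattern(key, n):
    """Derive a repeatable +1/-1 pattern from an integer key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed(pixels, key, strength=4):
    """Nudge each pixel up or down by a small, visually minor amount."""
    pattern = key_pattern(key, len(pixels))
    return [min(255, max(0, p + strength * s)) for p, s in zip(pixels, pattern)]

def correlation(pixels, key):
    """Near 0 when the key's pattern is absent, near `strength` when present."""
    pattern = key_pattern(key, len(pixels))
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) * s for p, s in zip(pixels, pattern)) / len(pixels)

# Random filler standing in for a 128 x 128 grayscale image (values 0-255).
image_rng = random.Random(0)
original = [image_rng.randint(64, 191) for _ in range(128 * 128)]

marked = embed(original, key=2023)

print(correlation(marked, 2023))    # pattern present
print(correlation(original, 2023))  # pattern absent
print(correlation(marked, 999))     # wrong key
```

A detector would flag the image when the correlation exceeds a chosen threshold. A simple pattern like this is not expected to survive cropping, resizing or recompression.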

The benefits of AI watermarking

Watermarking AI-generated content has several benefits:

Preventing the spread of AI-generated misinformation. Social media networks, news organizations and other online platforms could use AI watermarks to indicate to readers that a piece of content was created using AI. Adding a disclaimer label to an Instagram post that contains an AI-generated image could help thwart attempts to spread disinformation, for example.
Indicating authorship. Because watermarks trace online content back to a specific creator, they are useful in flagging AI output such as deepfake videos and bot-authored books. This could limit the spread of fraudulent content by helping creators prove that their name or image was used deceptively.
Establishing authenticity. Similar to a physical watermark on paper currency, AI watermarks serve as digital signatures that can demonstrate provenance, or the origin of a piece of media. This could be useful in contexts such as scientific investigations or legal proceedings, where research findings or evidence could be scanned for AI watermarks to evaluate their integrity.

The limitations of current AI watermarking techniques

Unfortunately, current AI watermarking techniques are unreliable and relatively easy to circumvent. In January 2023, for example, OpenAI launched an AI text detector for ChatGPT developed by Aaronson and other OpenAI researchers. But just six months later, OpenAI took down the AI classifier tool, citing its "low rate of accuracy."

Developing persistent AI watermarks that not even determined hackers can eliminate remains an open research problem. One significant issue is that watermarks are often easy to remove, particularly in text. For example, text watermarking strategies that involve slightly emphasizing certain words or using specific patterns can be overcome simply by human editing of AI-generated text.

There is also the problem of false positives -- incorrectly identifying a human-created piece of media as the product of AI. Malicious actors could trigger a false positive by adding a watermark to a real image to instill doubt about its authenticity. False positives could also arise through random chance if an image or passage of text happens to mimic the hallmarks of a particular watermark, leading to unfair accusations of plagiarism or deceit.
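The role of random chance can be made concrete with a small calculation. The sketch below assumes a hypothetical detector, not one described in this article, that flags a passage when at least `threshold` of its `n_tokens` tokens fall into a secret subset of the vocabulary, and that each token of human-written text lands in that subset independently with probability `gamma`. Under those assumptions, the false positive rate is a binomial tail probability.

```python
from math import comb

def false_positive_rate(n_tokens, threshold, gamma=0.5):
    """Chance that unwatermarked text reaches the detection threshold by luck."""
    return sum(
        comb(n_tokens, k) * gamma**k * (1 - gamma) ** (n_tokens - k)
        for k in range(threshold, n_tokens + 1)
    )

short_text = false_positive_rate(10, 8)    # 80% of 10 tokens
long_text = false_positive_rate(100, 80)   # 80% of 100 tokens
print(short_text)  # 0.0546875
print(long_text)
```

With the same 80% threshold, roughly one in 18 short human-written passages would be flagged, while the rate for the longer passage is vanishingly small. This suggests that short texts are especially exposed to chance false positives.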

Other watermarking techniques might work only for specific data sets and break down when a model is fine-tuned. Challenges remain around ensuring watermarks persist across model versions and applications; creating flexible watermarking techniques that can be applied across model architectures is likely to prove difficult as well.

Finally, striking the right balance in watermark detectability is difficult. Including too much modified data in the training set or altering a model's weights and features too aggressively during training can degrade the model's overall accuracy. Likewise, an overly obvious watermark could make AI-generated content useless -- for example, watermarked text that sounds highly unnatural because it heavily overemphasizes rare words and syntax patterns. But, at the other extreme, subtler watermarks are more vulnerable to tampering and risk being too weak for detectors to notice.

Even if these practical limitations are overcome, widespread AI watermarking could also raise ethical concerns. Namely, embedding unique watermarks into AI-generated content could compromise users' privacy by making it possible to track individuals' use of generative AI tools.
