AI model theft: Risk and mitigation in the digital era

Enterprises are spending big bucks on developing and training proprietary AI models. But cybercriminals are also eyeing this valuable intellectual property.

AI promises to radically transform businesses and governments, and its tantalizing potential is driving massive investment activity. Alphabet, Amazon, Meta and Microsoft committed to spending more than $300 billion combined in 2025 on AI infrastructure and development, a 46% increase over the previous year. Many more organizations across industries are also investing heavily in AI.

Enterprises aren't the only ones looking to AI for their next revenue opportunity, however. Even as businesses race to develop proprietary AI systems, threat actors are already finding ways to steal them and the sensitive data they process. Research suggests a lack of preparedness on the defensive side: a 2024 survey of 150 IT professionals published by AI security vendor HiddenLayer found that, while 97% said their organizations are prioritizing AI security, just 20% are planning and testing for model theft.

What AI model theft is and why it matters

An AI model is software trained on a data set to recognize relationships and patterns, which it then applies to new inputs to draw conclusions or take action. As foundational elements of AI systems, AI models use algorithms to make decisions and set tasks in motion without explicit human instruction.

Because proprietary AI models are expensive and time-consuming to create and train, one of the most serious threats organizations face is theft of the models themselves. AI model theft is the unsanctioned access, duplication or reverse-engineering of these models. If threat actors can capture a model's parameters and architecture, they can both recreate a working copy of the original model for their own use and extract the valuable data used to train it.

The possible fallout from AI model theft is significant. Consider the following scenarios:

  • Intellectual property loss. Proprietary AI models and the information they process are highly valuable intellectual property. Losing an AI model to theft could compromise an enterprise's competitive standing and jeopardize its long-term revenue outlook.
  • Sensitive data loss. Cybercriminals could gain access to any sensitive or confidential data used to train a stolen model and, in turn, use that information to breach other assets in the enterprise. Data theft can result in financial losses, damaged customer trust and regulatory fines.
  • Malicious content creation. Bad actors could use a stolen AI model to create malicious content, such as deepfakes, malware and phishing schemes.
  • Reputational damage. An organization that fails to protect its AI systems and sensitive data faces the possibility of serious and long-lasting reputational damage.

AI model theft attack types

The terms AI model theft and model extraction are often used interchangeably. In model extraction, malicious hackers use query-based attacks to systematically interrogate an AI system with prompts designed to tease out information about the model's architecture and parameters. A successful model extraction attack effectively reverse-engineers the original model, producing a functional copy known as a shadow model. A model inversion attack is a related type of query-based attack that aims specifically to recover the data an organization used to train its proprietary AI model.
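
To make the mechanics concrete, the following is a minimal sketch, in Python with scikit-learn, of how query access alone can leak a model's behavior. Both the "victim" and the attacker's shadow model run locally here for illustration; in a real attack, the victim would sit behind a prediction API and the attacker would see only its responses. The data, models and query strategy are simplified stand-ins, not a depiction of any specific incident.

```python
# Conceptual sketch of a query-based model extraction ("shadow model") attack.
# Both models run locally here; in a real attack, the victim would sit behind a
# prediction API and the attacker would only observe its outputs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# "Victim": a proprietary model the attacker cannot inspect directly.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
victim = LogisticRegression(max_iter=1000).fit(X, y)

# Attacker step 1: interrogate the model with synthetic queries and record its answers.
rng = np.random.default_rng(1)
queries = rng.normal(size=(5000, 10))       # inputs the attacker controls
stolen_labels = victim.predict(queries)     # outputs returned by the "API"

# Attacker step 2: train a surrogate ("shadow") model on the query/response pairs.
shadow = DecisionTreeClassifier(max_depth=8).fit(queries, stolen_labels)

# Agreement between shadow and victim on fresh inputs shows how much behavior leaked.
X_test = rng.normal(size=(1000, 10))
fidelity = accuracy_score(victim.predict(X_test), shadow.predict(X_test))
print(f"Shadow model agrees with the victim on {fidelity:.0%} of unseen inputs")
```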

Another type of AI model theft, called model republishing, involves malicious hackers making a direct copy of a publicly released or stolen AI model without permission. They might then retrain it -- in some cases, to behave maliciously -- to better suit their needs.

In their quest to steal an AI model, cybercriminals might use techniques such as side-channel attacks that track system activity, including execution time, power consumption and sound waves, to better understand an AI system's operations.
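
The snippet below illustrates only the measurement step of a timing side channel: recording per-query latency and looking for input-dependent patterns that might hint at a model's size or internal branching. The query_model function is a hypothetical placeholder that simulates an inference call; a real attack would target an actual endpoint and require far more sophisticated statistical analysis.

```python
# Minimal sketch of the observation step in a timing side-channel attack:
# record per-query latency and look for input-dependent differences.
# query_model() is a hypothetical stand-in for a call to the target AI system.
import time
import statistics

def query_model(payload):
    # Placeholder for a real inference call (e.g., an HTTP request to a model API).
    time.sleep(0.001 + 0.0005 * (len(payload) % 3))  # simulated input-dependent delay
    return "ok"

def time_queries(payloads, repeats=20):
    """Return mean latency per payload, averaged over several repeats."""
    timings = {}
    for p in payloads:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            query_model(p)
            samples.append(time.perf_counter() - start)
        timings[p] = statistics.mean(samples)
    return timings

for payload, latency in time_queries(["a", "ab", "abc"]).items():
    print(f"input length {len(payload)}: mean latency {latency * 1000:.2f} ms")
```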

Finally, classic cyberthreats -- such as malicious insiders and exploitation of misconfigurations or unpatched software -- can indirectly expose AI models to threat actors.

AI model theft prevention and mitigation

To prevent and mitigate AI model theft, OWASP recommends implementing the following security mechanisms:

  • Access control. Put stringent access control measures in place, such as multifactor authentication (MFA).
  • Backups. Back up the model, including its code and training data, in case it is stolen.
  • Encryption. Encrypt the AI model's code, training data and confidential information.
  • Legal protection. Consider seeking patents or other official intellectual property protections for AI models, which provide clear legal recourse in the case of theft.
  • Model obfuscation. Obfuscate the model's code to make it difficult for malicious hackers to reverse-engineer it using query-based attacks.
  • Monitoring. Monitor and audit the model's activity to identify potential breach attempts before a full-fledged theft occurs (see the sketch after this list).
  • Watermarks. Watermark AI model code and training data to maximize the odds of tracking down thieves.
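
As an illustration of the monitoring recommendation above, the following sketch flags API clients whose query behavior resembles systematic extraction -- an unusually high volume of almost entirely unique inputs within a short window. The log record format and thresholds are illustrative assumptions, not OWASP-prescribed values.

```python
# Minimal sketch of query monitoring aimed at extraction-style behavior:
# flag clients that issue an unusually high volume of unusually diverse queries
# in a short window. Record format and thresholds are illustrative assumptions.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class QueryRecord:
    client_id: str
    input_hash: str   # hash of the submitted input
    timestamp: float  # seconds since epoch

MAX_QUERIES_PER_WINDOW = 1000   # illustrative volume threshold
MIN_UNIQUE_RATIO = 0.9          # near-100% unique inputs suggests systematic probing

def flag_suspicious_clients(records, window_seconds=3600):
    """Return client IDs whose recent query pattern resembles model extraction."""
    if not records:
        return []
    latest = max(r.timestamp for r in records)
    recent = [r for r in records if latest - r.timestamp <= window_seconds]

    by_client = defaultdict(list)
    for r in recent:
        by_client[r.client_id].append(r.input_hash)

    flagged = []
    for client, hashes in by_client.items():
        unique_ratio = len(set(hashes)) / len(hashes)
        if len(hashes) > MAX_QUERIES_PER_WINDOW and unique_ratio >= MIN_UNIQUE_RATIO:
            flagged.append(client)
    return flagged

# Example use: feed in parsed inference-gateway logs and alert on anything returned.
# suspicious = flag_suspicious_clients(parsed_records)
```

In practice, this kind of check would run against the inference gateway's logs, and flagged clients could be rate-limited or blocked pending review.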

Amy Larsen DeCarlo has covered the IT industry for more than 30 years, as a journalist, editor and analyst. As a principal analyst at GlobalData, she covers managed security and cloud services.
