As AI deployments have increased over the last few years, data scientists and data engineers have, for the most part, blissfully ignored AI security threats related to how their models are trained and deployed.
But as hackers look for new ways to penetrate enterprise systems, they are likely to start targeting AI pipelines in new ways, said Dawn Song, a professor at the University of California, Berkeley, and co-founder and CEO of Oasis Labs, a security research firm.
"Attackers are always following the footsteps of new technology," she said at the EmTech Digital Conference in San Francisco. "As AI controls more systems and becomes more capable, the consequences will become more severe."
These kinds of attacks threaten to operate outside the bounds of traditional IT security models: they target how models are trained and built, how inference engines can be subverted in production and how the individual components of an AI pipeline can be compromised.
There are two key aspects AI managers need to consider: maintaining the integrity of AI models and protecting the confidentiality of the data used to train them. Integrity is compromised when a model produces incorrect results or is manipulated into producing results of an attacker's choosing. Confidentiality is compromised when hackers learn sensitive information about individuals by querying the learning system itself.
Song's research suggests that it's possible to make traffic signs look different to existing autonomous vehicle systems, an example of how attackers could compromise the integrity of a system. Using precisely positioned stickers that look like graffiti, her team was able to confuse self-driving car systems into thinking stop signs and yield signs were speed limit signs.
These types of AI security threats to the integrity of a system can occur at inference time, when the system is making predictions, or at training time, when the model is being built.
Inference-time attacks could be used to help a person or object avoid detection. An evasion attack might be incorporated into malware to slip past security software, or used by fraudsters to evade fraud detection. Security researchers have shown that inference-time attacks can make someone appear to be a different person to facial recognition algorithms, can generate fingerprint "master keys" capable of impersonating many different live fingerprints or, as mentioned above, can make stop signs look like speed limit signs.
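To illustrate the mechanics behind such evasion attacks, the widely studied fast gradient sign method perturbs an input in the direction that most increases a model's loss. The sketch below uses a toy logistic-regression "detector" with made-up weights and inputs, not any real system, to show how a small, bounded perturbation can flip the model's decision:

```python
import numpy as np

def fgsm_perturb(x, w, b, y_true, epsilon):
    """Fast gradient sign method: nudge each input feature in the
    direction that increases the model's loss, bounded by epsilon."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # sigmoid probability
    grad_x = (p - y_true) * w                # d(loss)/d(input) for cross-entropy
    return x + epsilon * np.sign(grad_x)

# Toy 'detector' that flags inputs whose weighted score is positive
w = np.array([2.0, -1.0, 0.5])
b = 0.0
x = np.array([0.4, 0.1, 0.2])                # score 0.8: flagged
x_adv = fgsm_perturb(x, w, b, y_true=1.0, epsilon=0.5)

print((w @ x + b) > 0, (w @ x_adv + b) > 0)  # → True False
```

Real attacks, like the doctored stop signs, apply the same gradient-following idea to image pixels under tighter constraints so the change stays inconspicuous to humans.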
Training-time attacks work by poisoning a training data set to trick AI systems into learning an outcome of the attacker's choosing. For example, Microsoft's Tay Twitter chatbot learned to interact with people using sexist and anti-Semitic language after being fed this kind of speech by a malicious subset of Twitter users.
These attacks are a particular risk when training data is crowdsourced or when malicious insiders have access to it, Song said, and it can be difficult to detect that a model has been poisoned.
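To see how little poisoned data it can take, consider a minimal sketch: a hypothetical nearest-centroid classifier (not any production system) in which just two mislabeled points injected by an insider shift the decision boundary:

```python
import numpy as np

def centroid_classifier(X, y):
    """Train a nearest-centroid classifier: predict the class whose
    mean feature vector is closest to the input."""
    c0, c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    return lambda x: int(np.linalg.norm(x - c1) < np.linalg.norm(x - c0))

# Clean training data: class 0 near the origin, class 1 near (4, 4)
X = np.array([[0.0, 0.0], [0.5, 0.5], [4.0, 4.0], [4.5, 4.5]])
y = np.array([0, 0, 1, 1])
clean = centroid_classifier(X, y)

# Poisoning: an insider injects two mislabeled points so that
# inputs near (2, 2) now fall on the class-1 side of the boundary
X_bad = np.vstack([X, [[1.8, 1.8], [2.2, 2.2]]])
y_bad = np.append(y, [1, 1])
poisoned = centroid_classifier(X_bad, y_bad)

probe = np.array([2.0, 2.0])
print(clean(probe), poisoned(probe))  # → 0 1
```

The poisoned model still classifies the original training points correctly, which is part of what makes this kind of tampering hard to detect.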
Single point of failure compromises integrity
Naveen Rao, general manager of the Artificial Intelligence Products Group at Intel, suggested that one approach to dealing with these kinds of machine learning integrity concerns lies in building more resilient AI systems in which different machine learning components compensate for weaknesses in each other. For example, in the case of autonomous systems being fooled by a doctored stop sign, AI developers could cross-check this data against mapping information as a safeguard.
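That cross-checking idea can be sketched as a simple consensus rule. The severity ordering and labels below are hypothetical illustrations, not Intel's or any vendor's actual policy:

```python
# Most-to-least restrictive sign interpretations (hypothetical policy)
SEVERITY = ["stop", "yield", "speed_limit"]

def cross_check(vision_label, map_label):
    """Resilience sketch: accept the camera's sign classification only
    when an independent source (the map) agrees; on disagreement,
    fall back to the more restrictive of the two readings."""
    if vision_label == map_label:
        return vision_label
    return min(vision_label, map_label, key=SEVERITY.index)

# A doctored stop sign fools the camera, but the map still says 'stop'
print(cross_check("speed_limit", "stop"))  # → stop
```

The point is architectural: no single model's output is trusted on its own, so fooling one component is no longer enough to compromise the system.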
Another approach might involve experimenting with generative adversarial networks primed to generate examples known to poison traditional machine learning training algorithms, making it easier to identify models that hold up better against these kinds of AI security threats.
Attacks on confidentiality extract secrets either during the creation of AI models or by probing the finished models with specially crafted queries.
"A lot of big data contains information such as Social Security numbers, dates of birth, credit card numbers or health information," Song said.
This data is sometimes used by data scientists and analysts to create AI models. There are several places in the machine learning development pipeline where this data could be extracted.
The first target of attackers might be untrusted tools, like open source data analysis and machine learning frameworks. If these tools are compromised, hackers could use them to siphon off sensitive data without an analyst noticing. Untrusted infrastructure tools that move data around pose a similar threat.
Implementing a vetting process for the tools data scientists use could guard against the risks posed by untrusted programs. Secure enclaves, such as the open source Keystone Enclave project, can also help protect this infrastructure, and UC Berkeley researchers are extending that tool to support field-programmable gate arrays (FPGAs).
Another approach might use secure computation techniques that mask sensitive data, or homomorphic encryption, which allows models to be built on data that stays encrypted.
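One of the simplest secure computation techniques is additive secret sharing, in which a value is split into random-looking shares that reveal nothing individually and are only meaningful when recombined. A minimal sketch, with an illustrative modulus and made-up salary figures:

```python
import secrets

PRIME = 2_147_483_647  # modulus for additive secret sharing

def share(value):
    """Split a sensitive value into two additive shares; neither
    share alone reveals anything about the value."""
    r = secrets.randbelow(PRIME)
    return r, (value - r) % PRIME

# Two records (e.g., salaries) are masked before leaving the data owner
a1, a2 = share(52_000)
b1, b2 = share(61_000)

# Each compute party adds only the shares it holds...
partial1 = (a1 + b1) % PRIME
partial2 = (a2 + b2) % PRIME

# ...and recombining the partial sums yields the true total, with no
# party ever seeing an individual raw value
total = (partial1 + partial2) % PRIME
print(total)  # → 113000
```

Production secure computation frameworks build far richer operations on this principle, but the masking idea is the same: the raw data never appears in the clear outside its owner's hands.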
Protecting AI models against specially crafted queries requires subtly perturbing the answers those queries return, typically by adding calibrated noise, without adversely affecting their usefulness. Uber, for example, has developed a toolkit for differential privacy that can help prevent these types of AI security threats.
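The core idea behind differential privacy is the Laplace mechanism: answer a query with noise scaled to how much any one individual could change the result. The sketch below is a minimal illustration of that mechanism, not code from Uber's toolkit:

```python
import random

def laplace_count(true_count, epsilon):
    """Laplace mechanism sketch for a counting query (sensitivity 1):
    add noise with scale 1/epsilon, so one individual's presence or
    absence changes the answer distribution only slightly."""
    # The difference of two exponential draws is Laplace-distributed
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

# A crafted query ('how many users share this exact birth date?')
# would otherwise leak whether a specific person is in the data set
noisy = laplace_count(true_count=42, epsilon=0.5)
print(round(noisy, 2))
```

A smaller epsilon means more noise and stronger privacy; aggregate statistics stay useful while answers about any single record become unreliable.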
Song said AI researchers need to be responsible for developing machine learning technologies that are more resilient against attacks and that ensure stronger confidentiality.
"Security will be one of the biggest challenges in deploying AI, and it will require more community effort," she said.