Securing AI during the development process
AI systems can have their data corrupted or 'poisoned' by bad actors. Luckily, there are protective measures developers can take to ensure their systems remain secure.
There is enormous interest in and momentum around using AI to reduce the need for human monitoring while improving enterprise security. Machine learning and other techniques are used for behavioral threat analytics, anomaly detection and reducing false-positive alerts.
At the same time, private and nation-state cybercriminals are applying AI to the other side of the security coin. Artificial intelligence is used to find vulnerabilities, shape exploits and conduct targeted attacks.
But what about the AI tools themselves? How does an enterprise protect the tools it is building and secure those it runs in production?
AI security threats
AI is software, so threats to AI systems include compromises aimed at obtaining money, confidential information or access to other systems via lateral attacks, as well as denial-of-service (DoS) attacks.
AI systems are also vulnerable to a different kind of attack called a corruption of service attack. An adversary may wish not so much to disable an AI system as to reduce its effectiveness. If your competitor uses AI to time trades, throwing the timing off could make the program less effective without making it wholly useless.
Corrupted function can also be a stepping-stone to common attacks. For example, AI used to identify suspicious network activity could potentially be blinded to certain activities if taught to ignore them; malefactors could then aggressively engage in those activities.
These AI-specific vulnerabilities are different from typical IT vulnerabilities in that they are not failures in the architecture or development of the software. Instead, they exploit the fact that AI systems are, by definition, learning systems and have unique vulnerabilities exposed through those learning processes.
Creating secure AI
Enterprises can defend AI software systems by employing secure development approaches such as DevSecOps. Vetted libraries and memory-safe languages such as Rust also help. No matter the overarching approach, the development process should ideally include broad-spectrum, automated security testing in the suite of functional tests that are run on every update. Failing to function securely is failing to function, full stop. Tests should include static code scanning, dynamic vulnerability scanning and scripted attacks.
The AI-specific attacks are aimed at an AI model's training data, whether to damage its effectiveness or to build some kind of backdoor attack vector. Unfortunately, there is no perfect way to prevent efforts to poison training data, because even internally generated training data can be tampered with by an insider. Fully self-sourcing data for training is growing less common, so most organizations need to focus on the data they get from others.
Data provenance, specifically information tracking and documenting the data supply chain, is crucial metadata to collect and attach to data sets. Although it is likely necessary to put some level of trust in data sources in order to train the models in the first place, the enterprise can't and shouldn't trust blindly. Tracking data sources so that poisoned data can be traced back to its origin not only helps the enterprise learn who not to trust, but also -- if the information on data quality is shared -- winnows untrustworthy sources out of the ecosystem.
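As a rough illustration of attaching provenance metadata to a data set, the sketch below records who supplied a file, when it was retrieved and a SHA-256 digest of the exact bytes received, so that later tampering can be detected and a bad batch traced to its source. The record fields, function names and the "vendor-a" identifier are hypothetical choices for this example, not an established schema.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ProvenanceRecord:
    source: str        # hypothetical identifier for the data supplier
    retrieved_at: str  # ISO 8601 timestamp supplied by the caller
    sha256: str        # digest of the exact bytes that were received


def make_record(data: bytes, source: str, retrieved_at: str) -> ProvenanceRecord:
    """Build a provenance record for a data set as it was received."""
    digest = hashlib.sha256(data).hexdigest()
    return ProvenanceRecord(source=source, retrieved_at=retrieved_at, sha256=digest)


def verify(data: bytes, record: ProvenanceRecord) -> bool:
    """Return True only if the data still matches the recorded digest."""
    return hashlib.sha256(data).hexdigest() == record.sha256


def to_json(record: ProvenanceRecord) -> str:
    """Serialize the record so it can be stored alongside the data set."""
    return json.dumps(asdict(record), sort_keys=True)
```

A digest only shows that the data has not changed since it was recorded. It says nothing about whether the data was trustworthy when it arrived, which is why the source field matters as much as the hash.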
Comparing data sets and looking at their data points can offer some hope of spotting anomalous (and potentially poisonous) data. However, both the utility of such approaches and the overhead cost of the extra processing rise with how often they are used to evaluate these evolving data sets. That is, improving the ability to spot corrupt data is expensive.
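One simple way to compare an incoming batch against a trusted baseline is a robust outlier check. The sketch below uses the modified z-score based on the median absolute deviation (MAD), with the conventional 0.6745 scaling constant and a 3.5 cutoff. It is a minimal example for a single numeric feature, and the function name and threshold are assumptions for illustration. Real poisoning can be far subtler than a single extreme value, so a check like this catches only crude cases.

```python
from statistics import median


def flag_outliers(baseline, batch, threshold=3.5):
    """Return indices of batch values that look anomalous against baseline.

    Uses the modified z-score: 0.6745 * |x - median| / MAD, where the
    median and MAD are computed from the trusted baseline only.
    """
    med = median(baseline)
    mad = median(abs(x - med) for x in baseline)
    if mad == 0:
        # Baseline has no spread: anything that differs from it is flagged.
        return [i for i, x in enumerate(batch) if x != med]
    return [
        i
        for i, x in enumerate(batch)
        if 0.6745 * abs(x - med) / mad > threshold
    ]
```

Running a check like this on every incoming batch illustrates the tradeoff described above: more frequent evaluation catches more, and costs more.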
Securing AI systems
Some AI systems are intended to adapt over time in response to the data they receive, and thus continually train themselves while running. The very act of learning and adapting means that bad actors can continue to try to corrupt their data streams. There is even less an enterprise can do to prevent this without disabling a system's ability to learn and self-modify. To mitigate exposure of confidential data, an IT department can consider putting data loss prevention (DLP) tools in front of the AI. Behavioral threat analytics can also mitigate the risk of lateral attacks or the triggering of some other vulnerability, thereby protecting an AI system.
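To make the idea of a DLP layer in front of an AI system concrete, here is a minimal sketch of a redaction filter applied to text before it reaches a model. The two patterns (email addresses and U.S. Social Security number formats) and the placeholder labels are illustrative assumptions only. Commercial DLP products use far broader detection than a pair of regular expressions.

```python
import re

# Hypothetical, deliberately small pattern set for illustration.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each match of a known sensitive pattern with a placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

The same filter could be applied to a model's output as well as its input, so that confidential values are stopped in both directions.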
This use of standard solutions to help secure AI points to an important underlying truth. Beyond the AI-specific threat of poisoned data, it is crucial to remember that an AI system is just another set of applications. Enterprises must ensure that shiny new AI systems are not deployed without the security measures employed in their normal production environments. This means no special exceptions to, or bypasses of, normal change management and similar controls. Better responses to AI-specific threats are evolving rapidly, but they are meant to be added layers to an already existing defense-in-depth strategy.