As more organizations prepare to buy AI-enabled security tools, potential customers have a lot of concerns about the best ways to install and use these products, as well as the technology behind them.
To help enterprises plan properly, SearchSecurity asked two experts -- Kapil Raina, AI Security Alliance founder and vice president of marketing at San Francisco-based authentication firm Preempt Security, and Eran Cohen, Preempt's vice president of product management -- to share some of the key topics both customers and vendors should discuss before new AI-based cybersecurity tools are deployed. Although Preempt specializes in authentication, the executives' recommendations apply to all AI-based security deployments.
What data is needed for effective AI security?
An AI-based model is only as good as the data used to train it, so the first topic enterprises and vendors should discuss is data -- ensuring the right kind, quality and amount is available for the AI product to be successful.
An AI security product won't work if an enterprise doesn't have the data required to feed into it, Raina said. For a product that uses AI to improve authentication, for example, a customer would need to have data that includes Active Directory logs. Other types of AI-based security products might need other data, such as traffic logs or network telemetry.
But not all logs are created equal, Raina added, and enterprises need to ensure their logs provide as much information as possible to the AI model.
That might mean activating verbose logging, a feature that lets systems record more detail than is usually captured. While the technique can slow performance, it's especially important for AI models because it provides them with much-needed context that is impossible to recreate if not captured the first time.
Vendors, for their part, need to ask potential customers about the accuracy of their data.
Specifically, vendors need to understand how a customer will ensure the data used to train and run an AI-enabled security product is high in fidelity and free from bias. Customers must also be able to meet ongoing tuning requirements, Cohen said, in order to maintain the accuracy of the tool and ensure the AI model continues to perform as expected.
"Since data defines the accuracy of the algorithms," Cohen said, "this element is critical for vendors to get right."
Additionally, vendors should ask how important it is for a customer to get real-time results compared to historical results because, Cohen said, "this will determine the data sources, data types, the machine learning algorithms and eventual accuracy" of the data, regardless of whether the AI tool is looking for anomalies in network traffic or user behavior.
Beyond the accuracy of the data, the second essential topic enterprises and vendors should discuss is the accuracy of the AI model.
The data processed by an AI-based security tool defines the accuracy of the resulting model and, in turn, how much confidence customers should place in its conclusions. Because of this, it is particularly important for vendors to measure a customer's tolerance for false positives and false negatives, Cohen said.
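One way to make that tolerance concrete is to count false positives and false negatives at a few candidate alert thresholds. The sketch below is illustrative only: the scores, the labels and the `error_rates` helper are hypothetical and do not come from any vendor's product.

```python
def error_rates(scores, labels, threshold):
    """Return (false_positive_rate, false_negative_rate) at a threshold.

    scores: model confidence (0.0 to 1.0) that each event is malicious
    labels: True if the event was actually malicious (e.g., analyst-verified)
    """
    # Benign events that the model would flag at this threshold
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    # Malicious events that the model would miss at this threshold
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    negatives = sum(1 for y in labels if not y)
    positives = sum(1 for y in labels if y)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr


# Hypothetical, hand-made sample: six scored events with known outcomes
scores = [0.95, 0.80, 0.60, 0.55, 0.30, 0.10]
labels = [True, True, False, True, False, False]

for threshold in (0.5, 0.7, 0.9):
    fpr, fnr = error_rates(scores, labels, threshold)
    print(f"threshold={threshold:.1f}  FPR={fpr:.2f}  FNR={fnr:.2f}")
```

On this toy data, raising the threshold removes the false positive but starts missing real incidents, which is the trade-off a customer's stated tolerance is meant to settle.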
Conclusions reached by an AI-enabled security product are not going to be 100% accurate, and AI and machine learning algorithms "give outputs in probabilistic terms," he said. So, customers need to decide how probable a conclusion is deemed to be before it triggers an alert or action.
For example, if an AI-based security product determines with only 50% confidence that behavior by a particular user is an anomaly, should that trigger a specific action or just an alert to spur further investigation by human analysts?
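That decision can be written down as a simple policy that maps a confidence score to a response. The following is a minimal sketch under stated assumptions: the `triage` function, the 0.9 and 0.5 cutoffs and the response names are all hypothetical, chosen only to illustrate the idea.

```python
def triage(confidence, act_at=0.9, alert_at=0.5):
    """Map a model's confidence that behavior is anomalous to a response.

    The cutoffs are assumptions for illustration; each organization would
    set its own based on its tolerance for false positives and negatives.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if confidence >= act_at:
        return "block"   # automated action, e.g., force reauthentication
    if confidence >= alert_at:
        return "alert"   # hand off to a human analyst for investigation
    return "log"         # record the event only


for score in (0.95, 0.50, 0.20):
    print(score, triage(score))
```

Under these assumed cutoffs, a 50% confidence finding produces an alert for human review rather than an automated action.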
Tuning an AI model requires some assumptions of accuracy, which can change based on the data sources and machine learning algorithms being used, Cohen said. What's more, AI models can "drift" over time. Malicious activity can slowly train an AI model to view anomalous behavior as normal or to draw the wrong conclusions, and inaccurate conclusions can sometimes be mistakenly reinforced, which means it's important to monitor the model's performance.
To that end, customers should ask how an AI security product's accuracy is established. Some vendors determine initial accuracy and then provide ongoing testing via human inputs, automated testing or golden data sets, which are curated data sets used to detect significant changes in an AI model's output. "Accuracy varies based on actual deployed environments, as well as the drift of the algorithms over time, so ongoing validation for accuracy is important," Cohen said.
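A golden data set check can be sketched in a few lines: score the same curated samples periodically and flag any whose score has moved by more than an agreed amount. The sample names, the scores, the 0.1 tolerance and the `drift_report` helper below are hypothetical, not a description of how any particular vendor performs validation.

```python
def drift_report(baseline, current, tolerance=0.1):
    """Return golden-set samples whose score moved by more than `tolerance`.

    baseline: {sample_id: score} recorded when the model was first validated
    current:  {sample_id: score} from the model as it behaves today
    """
    if baseline.keys() != current.keys():
        raise ValueError("baseline and current must cover the same samples")
    return {
        sample_id: (baseline[sample_id], current[sample_id])
        for sample_id in baseline
        if abs(current[sample_id] - baseline[sample_id]) > tolerance
    }


# Hypothetical golden set: two known attacks and one known-benign login
baseline = {"known_attack_1": 0.97, "known_attack_2": 0.92, "normal_login_1": 0.05}
current = {"known_attack_1": 0.96, "known_attack_2": 0.61, "normal_login_1": 0.07}

for sample_id, (before, after) in drift_report(baseline, current).items():
    print(f"{sample_id}: score moved from {before:.2f} to {after:.2f}")
```

In this made-up run, only the second known attack is reported, because its score has fallen well below the baseline. That is the kind of change that would prompt a closer look at whether the model has drifted.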
AI model transparency
Algorithmic transparency is another critical topic for both vendors and customers to consider. For an enterprise, understanding the algorithms underpinning AI security software will ensure the results of an AI process can be explained and adjusted if they don't meet quality expectations. This will enable companies to fine-tune how they're integrating AI with their security infrastructure and minimize the occurrence of false positives or false negatives. Not all vendors will share how their algorithms work, which could affect the usefulness of AI over time because it reduces the explainability of the output, Cohen said.
Still, enterprises should try to get as much information as possible, including asking questions about the machine learning models used, whether they are supervised or unsupervised and whether the models can be fine-tuned.
Understanding the AI model, how it makes decisions and how to train it to avoid mistakes is crucial for customers assessing an AI-enabled security product, Raina said.
Understanding setup, operation and cost
AI products are complex and can involve steep learning curves, even if an enterprise has the necessary data. Understanding the challenges associated with the setup and operation of an AI-enabled security tool is another major topic to discuss. Enterprises must determine whether the value provided by the AI product is worth their effort, money and resources.
Running a proof of concept, a technique many enterprises use to assess new technology, can be complicated with AI-based tools, Raina said.
"AI and machine learning are things you typically want to run in a new environment, but you can't run it on the vendor's data. You typically want to run it on your own data," Raina said. "So, what does that process look like?"
In addition, Raina said enterprises need to understand the "hidden costs" around the data needed to run an AI security product, including the storage of the data, privacy and security protections, GDPR compliance and ensuring the data remains high quality.
"A good amount of money" in a contract for an AI security product will be spent on services, fine-tuning the model, training -- both the AI model itself and the staff using the product -- and the costs around the data, Raina said. All of these factors can "create additional complexity that you didn't plan for."
Likewise, Cohen suggested vendors need to be attuned to a potential customer's experience and familiarity with technology like AI. If a customer doesn't know what it wants to achieve with AI-based security or doesn't have the resources to adequately deploy AI security products, problems could surface down the road.