What is anomaly detection?
Anomaly detection is the process of identifying data points, entities or events that fall outside the normal range. An anomaly is anything that deviates from what is standard or expected. Humans and animals do this habitually when they spot a ripe fruit in a tree or a rustle in the grass that stands out from the background and could represent an opportunity or threat. Thus, the concept is sometimes framed as outlier detection or novelty detection.
Anomaly detection has a long history in statistics, driven by analysts and scientists who pored over charts to find elements that stood out. Over the last several decades, researchers have started automating this process using machine learning training techniques designed to find more efficient ways to detect different types of outliers.
In practice, anomaly detection is often used to detect suspicious events, unexpected opportunities or bad data buried in time series data. A suspicious event might indicate a network breach, fraud, crime, disease or faulty equipment. An unexpected opportunity could involve finding a store, product or salesperson that's performing much better than others and should be investigated for insight into improving the business.
An anomaly could also be the result of faulty equipment, broken sensors or a disconnected network. In these instances, a data scientist might want to remove the anomalous data records from further analysis so as not to compromise the development of new algorithms.
How does anomaly detection work?
There are several ways of training machine learning algorithms to detect anomalies. Supervised machine learning techniques are used when you have a labeled data set indicating normal vs. abnormal conditions. For example, a bank or credit card company can develop a process for labeling fraudulent credit card transactions after those transactions have been reported. Medical researchers might similarly label images or data sets indicative of future disease diagnosis. In such instances, supervised machine learning models can be trained to detect these known anomalies.
Researchers might start with some previously discovered outliers but suspect that other anomalies also exist. In the scenario of fraudulent credit card transactions, consumers might fail to report suspicious transactions with innocuous-sounding names and of a small value. A data scientist might use reports that include these types of fraudulent transactions to automatically label other like transactions as fraud, using semi-supervised machine learning techniques.
Supervised vs. unsupervised anomaly detection techniques
The supervised and semi-supervised techniques can only detect known anomalies. However, the vast majority of data is unlabeled. In these cases, data scientists might use unsupervised anomaly detection techniques, which can automatically identify exceptional or rare events.
For example, a cloud cost estimator might look for unusual upticks in data egress charges or processing costs that could be caused by a poorly written algorithm. Similarly, an intrusion detection algorithm might look for novel network traffic patterns or a rise in authentication requests. In both cases, unsupervised machine learning techniques might be used to identify data points indicating things that are well outside the range of normal behavior. In contrast, supervised techniques would have to be explicitly trained using examples of previously known deviant behavior.
Different types of anomalies
Broadly speaking, there are three different types of anomalies.
- Global outliers, or point anomalies, occur far outside the range of the rest of a data set.
- Contextual outliers deviate from other points in the same context, e.g., holiday or weekend sales.
- Collective outliers occur when a range of different types of data vary when considered together, for example, ice cream sales and temperature spikes.
Anomaly detection techniques
Many different kinds of machine learning algorithms can be trained to detect anomalies. Some of the most popular anomaly detection methods include the following:
- Density-based algorithms determine when an outlier differs from a larger, hence denser normal data set, using algorithms like K-nearest neighbor and Isolation Forest.
- Cluster-based algorithms evaluate how any point differs from clusters of related data using techniques like K-means cluster analysis.
- Bayesian-network algorithms develop models for estimating the probability that events will occur based on related data and then identifying significant deviations from these predictions.
- Neural network algorithms train a neural network to predict an expected time series and then flag deviations.
Why is anomaly detection important for businesses?
Anomaly detection systems can be used in various ways to improve business, IT and application performance. These systems can also enhance the detection of fraud, security incidents and opportunities for innovation. The following are some other common use cases for anomaly detection:
- Predicting equipment failure.
- Detecting early signs of pending IT failures.
- Detection of pricing glitches.
- Enhanced fraud prevention.
- Identifying DDoS attacks.
- Identifying stores and products that do better than expected.
- Better product quality.
- Enhanced user experience.
- Cloud cost management.
Anomaly detection applications and examples
- In cloud cost management, anomaly detection could look for sudden shifts in resource utilization, such as increased database calls, spikes in egress charges or increased SaaS charges. This could help managers identify whether this increase was caused by a new application version release, security breach, or associated with a successful product launch.
- In cybersecurity, anomaly detection can evaluate thousands of data streams to detect changes in access requests, an uptick in failed authentications or novel traffic patterns that bear further investigation. Anomaly detection is often built into most security appliances and services for intrusion detection systems, web application firewalls and API security tools.
- Application performance management tools commonly monitor logs of all traffic to identify performance issues or failures. In these cases, anomaly detection can allow them to detect new issues not identified with traditional rule-based analysis approaches.
- In banking and finance, anomaly detection is commonly used to identify fraud by correlating factors such as the size of transactions, time, location and spending rate. For example, suspiciously large transactions in a foreign country might be flagged. Or a suspiciously large number of smaller transactions from a new vendor might similarly be investigated.
Challenges of anomaly detection
Challenges in anomaly detection include the following:
- Data infrastructure needs to be scaled to support useful anomalies.
- Data quality issues can reduce the performance of anomaly detection.
- Poor anomaly detection algorithms can inundate users with false alerts.
- It may take a long time to develop a useful baseline to account for normal patterns like holiday sales, heat waves or other normal things that occur less frequently.
Considerations for designing an effective anomaly detection system
Data scientists, IT managers, security managers and business teams must consider several aspects when designing anomaly detection apps to provide the appropriate value.
- Timeliness. What is the time to value? A fraud detection system must operate in seconds, a security system in minutes, while a business trends analysis app might deliver value with daily updates.
- Scale. Is the objective speed or depth of analysis? Analyzing a few metrics can yield fast results, but deeper insight may require thousands or even millions of data streams.
- Rate of change. How quickly do events being analyzed in the data change? Predictive maintenance apps may need to analyze real-time data streams, while business data tends to change more slowly.
- Conciseness. Is there a better way of summarizing insights of interest relevant to decision-makers?
- Defining incidents. How can you automate the process of labeling related types of anomalies to determine root causes and appropriate responses?
- Explainability. Is it enough to determine if an anomalous event has occurred, or should priority be given to algorithms that can explain contributing factors, even if they are not as accurate?
Anomaly detection tools and software
Anomaly detection is generally baked into most modern security, IT management, and fraud detection systems and applications. However, enterprises that want to develop their own anomaly detection algorithms may wish to turn to popular statistics, data science, and mathematical packages and tools. A sampling of popular ones include the following:
- Anodot, a business monitoring platform that can detect anomalies in business and cloud events.
- Amazon SageMaker, a data science platform that supports anomaly detection.
- ELKI, an open source data mining tool.
- Microsoft AI Anomaly detector service for Azure.
- PyOD, an open source anomaly detection library written in Python.
- Scikit-learn, a popular data science library that supports anomaly detection.
- Wolfram Mathematica, an algorithm development tool that supports anomaly detection.
How to customize your company's anomaly detection strategy
Anomaly detection is a complicated endeavor. It is one thing to experiment with new tools for detecting anomalies. But in practice, it isn't easy to reliably detect anomalies of value without inundating users with false positives.
In most cases, it will probably be easier to take advantage of domain-specific tools with built-in anomaly detection capabilities for applications like cloud cost management, IT service management or fraud detection.
Bespoke anomaly detection development makes more sense for companies that want to add anomaly detection capabilities to their own products and services. In these cases, it makes sense to take advantage of open source and proprietary data science platforms like scikit-learn or Mathematica.