In the near future, big data security analytics will become as common as malware detection and vulnerability scanning. That's because these platforms allow enterprises to capture data from multiple, varied data sources, integrate that data in near real time, analyze patterns and detect malicious activity, as well as monitor, report and conduct forensic investigations.
This article highlights some of the most important features of several leading big data security analytics tool vendors -- Cybereason, Fortscale, Hexis Cyber Solutions, IBM, LogRhythm, RSA and Splunk -- against the five factors essential for realizing the full benefits of these platforms. These factors, as described in detail in the last article in this series, include:
- Unified data management
- Support for multiple data types, including log, vulnerability and flow
- Scalable data ingestion
- Information security-specific analytic tools
- Compliance reporting
Unified data management
Unified data management is the underpinning of a big data security analytics product, as the data management platform stores and queries data across the enterprise. It also has to balance data management features with cost and scalability.
As Hadoop is a widely used big data management platform and associated ecosystem, it isn't surprising to see it used as the basis for a number of big data security analytics platforms. Fortscale, for example, uses the Cloudera Hadoop distribution. This allows the Fortscale platform to scale linearly as new nodes are added to the cluster.
IBM's QRadar uses a distributed data management system that provides horizontal scaling of data storage. In some cases, distributed security information and event management (SIEM) systems may only need access to local data, but in some situations -- especially forensic analysis -- users may need to search across the distributed platform. IBM QRadar also incorporates a search engine that allows searching across platforms, as well as locally. This big data SIEM, meanwhile, uses data nodes rather than storage area networks, which helps minimize cost and management complexity. This distributed storage model based on data nodes can scale to petabytes of storage for those organizations that need large volumes of long-term storage.
RSA Security Analytics also employs a distributed, federated architecture to enable linear scaling. The analyst workflow in RSA's tool addresses a critical need when scaling to large volumes of data: prioritizing events and tasks to improve the efficiency of analysis.
Hexis Cyber Solutions' Hawkeye Analytics Platform (Hawkeye AP) is built on a data warehouse platform for security event data. In addition to having low-level, scalable data management -- such as the ability to store large volumes of data in files across multiple servers -- it is crucial to have tools for querying data in a structured manner. Hawkeye AP is tuned to store data in a time-partitioned way that eliminates the need for globally rebuilding indexes. It is also designed as a read-only database. This allows for performance optimizations, but more importantly, it ensures data will not be tampered with once it is written. It is worth noting that Hawkeye AP uses columnar data storage -- as opposed to row-oriented storage -- which is optimized for analytics applications.
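The storage ideas described above (time partitioning, append-only writes and column-oriented layout) can be illustrated with a small sketch. This is not how Hawkeye AP is implemented; the class `TimePartitionedStore`, its hourly partition size and the `src_ip` and `action` columns are all assumptions chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

HOUR = timedelta(hours=1)  # assumed partition size


class TimePartitionedStore:
    """Hypothetical append-only store: one columnar partition per UTC hour."""

    def __init__(self):
        # Partition start time -> column name -> list of values.
        self._partitions = defaultdict(
            lambda: {"ts": [], "src_ip": [], "action": []}
        )

    @staticmethod
    def _bucket(ts):
        # Timestamps are expected to be timezone-aware.
        utc = ts.astimezone(timezone.utc)
        return utc.replace(minute=0, second=0, microsecond=0)

    def append(self, ts, src_ip, action):
        """Write one event. There is no update or delete operation."""
        part = self._partitions[self._bucket(ts)]
        part["ts"].append(ts)
        part["src_ip"].append(src_ip)
        part["action"].append(action)

    def partitions_scanned(self, start, end):
        """Partitions whose hour overlaps the half-open range [start, end)."""
        return [b for b in sorted(self._partitions)
                if b < end and b + HOUR > start]

    def count_where(self, start, end, column, value):
        """Count matching events, reading only the one column needed."""
        total = 0
        for bucket in self.partitions_scanned(start, end):
            part = self._partitions[bucket]
            for ts, cell in zip(part["ts"], part[column]):
                if start <= ts < end and cell == value:
                    total += 1
        return total
```

Because each new event lands in the partition for its own hour, older partitions are never touched again, which is why a layout like this avoids global index rebuilds. A query over a one-hour window reads a single partition and a single column rather than every stored row.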
Support for multiple data types
Volume, velocity and variety are terms often used to describe big data. The variety of security event data poses a number of data integration challenges for a big data security analytics product.
RSA Security Analytics' answer is to employ a modular architecture to enable the capture of multiple data types while maintaining the ability to add other sources incrementally. The platform is designed to capture large volumes of full network packets, NetFlow data, endpoint data and logs.
Sometimes multiple data types imply multiple security tools. IBM's QRadar, for example, has a vulnerability manager component designed to integrate data from a variety of vulnerability scanners and augment that data with context-relevant information about network usage. IBM Security QRadar Incident Forensics is another specialty module for analyzing security incidents using network flow data and full-packet capture. The forensic tool includes a search engine that scales to terabytes of network data.
LogRhythm's Security Intelligence Platform is another example of a big data security analytics platform with broad support for diverse data types, including system logs, security events, audit logs, machine data, application logs and flow data. The platform analyzes raw data from these sources to generate second-tier data about file integrity, process activity, network communications and user activity.
Splunk Enterprise Security allows analysts to search data and perform visual correlations to identify malicious events and collect data about the context of those events.
Scalable data ingestion
Big data security analytics products must ingest data from servers, endpoints, networks and other infrastructure components that are constantly changing state. The principal risk for this data ingestion component is that it can't keep up with the influx of incoming data.
Splunk is widely recognized for its broad data ingestion capabilities. The platform not only offers connectors to data sources, but allows for custom connectors as well. Data is stored in a schema-less fashion and indexed on ingestion to enable varying data types while still providing rapid query response.
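To make the idea of schema-less storage with indexing on ingestion more concrete, here is a minimal sketch of an inverted index built as raw events arrive. It is not Splunk's implementation or API; the `EventIndex` class and its tokenization rule are assumptions made for illustration.

```python
import re
from collections import defaultdict


class EventIndex:
    """Hypothetical schema-less store: raw text in, token index built on ingest."""

    def __init__(self):
        self.events = []                 # raw events, kept exactly as received
        self._index = defaultdict(set)   # token -> ids of events containing it

    def ingest(self, raw):
        """Store one raw event and index its tokens immediately."""
        event_id = len(self.events)
        self.events.append(raw)
        # Assumed tokenization: runs of word characters and dots, lowercased.
        for token in set(re.findall(r"[\w.]+", raw.lower())):
            self._index[token].add(event_id)
        return event_id

    def search(self, *terms):
        """Return events containing every term, in ingestion order."""
        if not terms:
            return []
        matches = set.intersection(
            *(self._index.get(term.lower(), set()) for term in terms)
        )
        return [self.events[i] for i in sorted(matches)]
```

No field definitions are declared up front, so events of any shape can be ingested, and a search only intersects small sets of event ids instead of scanning every stored event.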
As for IBM QRadar, it scales from single-appliance deployments to geographically distributed systems. This big data product, like others covered here, is designed to meet the demands of large enterprises. IBM QRadar has been used to process hundreds of thousands of events per second in real-world applications. Small organizations, or those just starting with IBM QRadar, may want to deploy the system in a cloud environment to minimize infrastructure management. Hybrid deployments are also possible, so event and flow data may be processed in the cloud with only summarized incident data sent back to on-premises systems.
Another important type of integration is data augmentation. This is the process of adding contextual information to event data as it is collected. For example, RSA Security Analytics enriches network data as it is analyzed by adding details about network sessions, threat indicators and other details that can help analysts understand the broader picture surrounding low-level security data.
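As a rough illustration of data augmentation, the sketch below attaches context to a network event at collection time. It does not reflect RSA Security Analytics internals; the lookup tables `THREAT_INDICATORS` and `INTERNAL_NETWORKS`, the event field names and the `enrich` function are hypothetical.

```python
import ipaddress

# Hypothetical context sources. A real deployment would load these from
# a threat intelligence feed and an asset inventory.
THREAT_INDICATORS = {"203.0.113.7": "known-c2-server"}
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]


def enrich(event):
    """Return a copy of the event with contextual fields added."""
    enriched = dict(event)  # leave the original event untouched
    dst = ipaddress.ip_address(event["dst_ip"])
    enriched["dst_internal"] = any(dst in net for net in INTERNAL_NETWORKS)
    # None means no threat indicator matched the destination address.
    enriched["threat_tag"] = THREAT_INDICATORS.get(event["dst_ip"])
    return enriched
```

An analyst looking at the enriched record sees at once that an internal host contacted a flagged external address, rather than having to look up each IP address by hand.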
How a big data analytics platform collects data is another key consideration. The time it takes to collect data puts a lower bound on how fast security events can be detected. The location of data collection points determines the breadth and types of data collected. The Cybereason Platform, for example, employs sensors that run in the user space of endpoint operating systems, allowing data collection without disrupting the user experience or lower-level kernel functions. The Cybereason sensors can collect data even when devices are not connected to the enterprise network.
Security analytics tools
Big data security analytics tools should scale to meet the amount of data produced by an enterprise. Analysts, meanwhile, should be able to query event data at a level of abstraction that reflects an information security perspective.
Fortscale employs machine learning and statistical analysis -- collectively known as data science techniques -- to adapt to changes in the security environment. These techniques allow Fortscale to drive analysis based on data rather than just predefined rules. As baseline behaviors change on the network, machine learning algorithms can detect the changes without human intervention to update fixed sets of rules.
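One simple way to picture a baseline that adapts without rule updates is an exponentially weighted mean and variance. The sketch below is a generic statistical technique, not Fortscale's algorithm; the `AdaptiveBaseline` class and its `alpha`, `threshold`, `warmup` and `min_std` parameters are assumptions.

```python
import math


class AdaptiveBaseline:
    """Hypothetical detector: flags values far from a moving baseline."""

    def __init__(self, alpha=0.1, threshold=3.0, warmup=5, min_std=1.0):
        self.alpha = alpha          # weight given to each new observation
        self.threshold = threshold  # allowed deviation, in standard deviations
        self.warmup = warmup        # observations to see before flagging
        self.min_std = min_std      # floor so a flat baseline is not oversensitive
        self.mean = None
        self.var = 0.0
        self.n = 0

    def observe(self, value):
        """Score a value against the current baseline, then update the baseline."""
        self.n += 1
        if self.mean is None:
            self.mean = float(value)
            return False
        std = max(math.sqrt(self.var), self.min_std)
        anomalous = (self.n > self.warmup
                     and abs(value - self.mean) > self.threshold * std)
        # Exponentially weighted updates let the baseline drift with the data.
        diff = value - self.mean
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
        return anomalous
```

Fed a user's hourly login counts, for example, the detector would flag a sudden jump, and if the higher level persisted it would gradually treat that level as normal without anyone editing a rule.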
RSA Security Analytics includes predefined reports and rules to enable analysts to quickly start making use of data collected by the big data analytics SIEM.
Security analytics is also heavily dependent on intelligence about malicious activities. RSA Security Analytics includes the RSA Live service that delivers data processing and correlation rules to RSA Security Analytics deployments. These new rules can be used to analyze new data arriving in real time and historical data stored on the RSA Security Analytics system. Like Fortscale, RSA Security Analytics uses data science techniques to enhance the quality of analysis.
LogRhythm's analytics workflow, meanwhile, includes processing, machine analytics and forensic analytics stages. The processing step transforms data in ways to increase the likelihood that useful patterns will be detected from the raw data. This processing includes time normalization, data classification, metadata tagging and risk contextualization.
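A processing stage of this kind can be sketched in a few lines. The example covers time normalization, classification and metadata tagging only, and it is not LogRhythm's implementation; the `CLASS_RULES` keyword table, the `process` function and the output field names are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical keyword rules mapping message text to an event class.
CLASS_RULES = [
    ("failed password", "authentication_failure"),
    ("accepted password", "authentication_success"),
]


def process(raw_ts, message, source):
    """Normalize the timestamp to UTC, classify the message and tag its source."""
    ts = datetime.fromisoformat(raw_ts)
    if ts.tzinfo is None:
        # Without an offset the event cannot be placed on a common timeline.
        raise ValueError("timestamp must include a UTC offset")
    lowered = message.lower()
    event_class = next(
        (label for keyword, label in CLASS_RULES if keyword in lowered),
        "unclassified",
    )
    return {
        "ts_utc": ts.astimezone(timezone.utc).isoformat(),
        "class": event_class,
        "source": source,
        "message": message,
    }
```

Once every event carries a UTC timestamp and a common class label, events from hosts in different time zones and log formats can be ordered and correlated on a single timeline.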
Compliance reporting, alerting and monitoring
Compliance reporting of one type or another is a must-have requirement for most enterprises today. An organization should confirm that the reporting capabilities included with the big data security platforms it is considering meet its specific compliance needs.
The IBM Security QRadar Risk Manager add-on provides tools to manage network device configurations in support of compliance and risk management. Capabilities of the Risk Manager add-on include automated monitoring, support for multiple vendor product audits, compliance policy assessment and threat modeling.
Fortscale, as noted earlier, uses machine learning algorithms to continually assess changes to baseline activity and detect anomalous events. As the system detects these, it can generate alerts and provide contextual information about events.
RSA Security Analytics comes out of the box with approximately 90 templates to meet the reporting needs of SOX, HIPAA, PCI DSS and other regulations with minimal effort on the part of end users.
Reporting and alerting in SIEM systems are evolving to support much more than fixed reports and simple alerts. The Cybereason Platform is specially designed to automate the detection of malicious activity, for instance. The platform provides an investigation console that consolidates information and visualizes the attack timeline, affected users and devices.
Splunk Enterprise Security provides for continuous monitoring through dashboards that include key security and performance indicators, as well as trending indicators. The platform supports prioritized workflows as well. The Splunk platform also supports tracking highly privileged users and reporting on access attempts to critical applications.
Hawkeye AP comes with 400 reports, which can be adapted to specific requirements. Since Hawkeye AP uses relational database technology and supports ANSI Standard SQL, as well as ODBC and JDBC drivers, there is the option of creating custom reports using widely adopted enterprise reporting tools.
LogRhythm's platform includes risk-prioritized alarms, standard reports and a real-time reporting dashboard. It also includes additional tools for forensic analysts, including case management tools, an evidence locker and incident-tracking metrics.
The capabilities of big data security analytics tools
Big data security analytics tools are capable of analyzing a broad range of data types while processing large volumes of data. Not all organizations may need all the features of today's big data security analytics products, but those organizations looking for the next important tool for securing enterprise data should consider the role of big data security analytics tools.
IBM QRadar is a logical option for large enterprises and those that need to retain detailed event data. The platform's ability to scale to petabytes of storage will be of particular interest to such organizations. Hawkeye AP's data warehouse model and use of columnar storage bring the capabilities of business intelligence reporting to information security, making it a platform to consider when advanced or custom reporting is required. Cybereason should be considered when there is a need to capture event data while devices are offline. RSA Security Analytics and LogRhythm's Security Intelligence Platform, meanwhile, are well-suited to use cases with a wide variety of data types. Splunk provides a wide range of data source connectors, making it another good option for enterprises with a broad array of data sources.
Big data security analytics will likely appeal more to large enterprises, but as the cost and complexity of the tools come down, midsize and eventually small businesses will begin to realize the benefits of the technology.
In part one of this series, learn about the basics of big data security analytics in the enterprise
In part two of this series, discover the enterprise use cases for big data security analytics
In part three of this series, find out how to evaluate big data security analytics products