Date: 2022-08-22

For many, security isn't the first thing that comes to mind when they hear about chaos engineering. It's likely that even fewer would consider it as a fundamental security practice on par with things like network firewall configuration, identity management and intrusion detection.

However, modern software security layers have grown more complex while architectures have become increasingly modular and distributed. The resulting risk of failure is now high enough to justify chaos engineering as a security tool. As such, chaos engineering may well become not just a routine but an essential part of security management.

Let's examine the reasons chaos engineering is gaining traction as a security management method, detail the way it's applied in security scenarios and review some of the best practices to follow when putting it into practice -- including some of the common pitfalls to avoid.

The role of chaos engineering in security

Chaos engineering is a broad term that describes the act of testing complex systems by injecting failures before an application encounters them in normal operations, monitoring the outcomes and documenting the right course of action. The concept is often applied to operational hardware -- including networks and server pools -- as well as software development and product testing.

Chaos might not sound like the sort of thing a security specialist or compliance team would want to cultivate within their software systems. The goal of chaos engineering, however, is to prevent chaos by identifying inconspicuous problems and potential failures before they occur in production. And, as the practice matures, chaos engineering is garnering more attention in the field of application security.

By performing chaos engineering on the security layers directly, security specialists can broaden the range of situations and attack vectors they are capable of simulating. It also allows them to test how the relationships among the multiple layers and features affect the impact of a given failure. Over time, this reveals areas where security layers fail to create an effective barrier against attacks and intrusions.

Applying chaos to security: Injections and monitoring

Chaos engineering testing for security is a matter of balancing two layers. One layer handles the injection of faults; the other is where the monitoring and resolution processes take place. For example, one layer injects test data to simulate unauthorized access attempts. The other layer watches for signals of security breaches, allowing security teams to locate gaps in access controls.

If those injections induce a failure or reveal a hole in an existing security barrier, the monitoring process should identify the exact point and time at which the problem or breach occurred. Logs and monitoring data from the application infrastructure, together with the log of injected security faults, also help correlate infrastructure problems that may pose a security threat.

While it's possible to apply chaos engineering to security and infrastructure separately, doing so would likely be a mistake. Security breaches can stem not only from unexpected events linked to security or threat-prevention tools, but also from events in the IT infrastructure. For instance, infrastructure faults often trigger systems to run in a "failure mode" that may not break any functional elements but may instead open a potentially unwanted bypass around certain security elements so that fixes can be applied.
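To make the two-layer idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a reference implementation: the in-memory `POLICY`, the `check_access` function, the `degraded` flag (standing in for an infrastructure "failure mode" that fails open) and all other names are hypothetical. A real experiment would target actual access controls and draw on real logs.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class InjectedFault:
    """One entry in the injection layer's log."""
    fault_id: str
    user: str
    resource: str
    injected_at: datetime


@dataclass(frozen=True)
class AccessEvent:
    """One entry in the monitored system's log."""
    fault_id: str
    allowed: bool
    observed_at: datetime


# Hypothetical policy under test: resource -> users allowed to reach it.
POLICY = {
    "/reports": {"alice", "bob"},
    "/admin": {"alice"},
}


def check_access(user, resource, degraded=False):
    """Toy access control. When `degraded` is True it fails open, modelling
    an unwanted bypass triggered by an infrastructure fault (an assumption)."""
    if degraded:
        return True
    return user in POLICY.get(resource, set())


def inject_unauthorized_attempts(attempts, degraded=False):
    """Injection layer: replay attempts that should all be denied, logging
    both what was injected and how the system responded."""
    fault_log, event_log = [], []
    for i, (user, resource) in enumerate(attempts):
        fault_id = f"fault-{i}"
        fault_log.append(
            InjectedFault(fault_id, user, resource, datetime.now(timezone.utc))
        )
        allowed = check_access(user, resource, degraded=degraded)
        event_log.append(
            AccessEvent(fault_id, allowed, datetime.now(timezone.utc))
        )
    return fault_log, event_log


def find_gaps(fault_log, event_log):
    """Monitoring layer: correlate the two logs by fault ID and return every
    injected attempt that was wrongly allowed, with the time it was observed."""
    faults = {f.fault_id: f for f in fault_log}
    return [
        (faults[e.fault_id], e.observed_at)
        for e in event_log
        if e.allowed and e.fault_id in faults
    ]


# Every attempt below violates POLICY, so each should be denied.
ATTEMPTS = [("mallory", "/admin"), ("bob", "/admin"), ("mallory", "/reports")]

normal_gaps = find_gaps(*inject_unauthorized_attempts(ATTEMPTS))
degraded_gaps = find_gaps(*inject_unauthorized_attempts(ATTEMPTS, degraded=True))

print("gaps in normal mode:", len(normal_gaps))
print("gaps in failure mode:", len(degraded_gaps))
for fault, seen_at in degraded_gaps:
    print(fault.fault_id, fault.user, fault.resource, seen_at.isoformat())
```

In this sketch the same injected attempts are run twice. With the policy enforced normally, no gaps are reported; with the simulated failure mode switched on, every attempt slips through and is reported along with the time it was observed. That contrast is the reason to exercise security and infrastructure faults together rather than in isolation.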