Build an AWS incident response plan in 4 key steps

Enterprises should develop and test an incident response plan for an AWS deployment -- before a hack ever occurs. Follow these four steps to get started.

Despite IT's best efforts, hackers still find ways to penetrate even the most sophisticated AWS systems. This makes it critical to create and practice an AWS incident response plan to minimize the effects of a breach and ensure you properly address the root cause.

One of the worst things an enterprise can do is simply refresh affected systems with an older version of an application that's presumably secure. This response could perpetuate the problem, delete valuable evidence, obscure the root cause and alert the hackers, who might take steps to hide their tracks.

At a high level, there are four steps an enterprise should follow to implement an AWS incident response plan:

  1. Create a shared understanding of a potential incident.
  2. Develop a process to quickly and automatically capture all data related to an incident.
  3. Establish a relationship and handoff procedure with a security forensics team.
  4. Practice executing the plan to identify gaps and reduce anxiety when a real problem arises.

Let's take a closer look at each of these steps.

1. Cultivate incident awareness

An enterprise might be alerted to an incident in a variety of ways. It might receive an alert from AWS, discover its AWS credentials or data on the dark web or receive a notification of unusual activity from a partner. Ideally, your organization will pay close enough attention to the subtle anomalies in the ways an app is configured, run and accessed to discover problems before any of these actions occur.

Set up Amazon CloudWatch alarms as an early warning system to help identify these anomalies. Look for spikes in usage charges. A spike can result from poor coding practices or unanticipated service growth, but it can also be a sign of intrusion via compromised security credentials.
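As a rough illustration, a charge-spike alarm could be defined along these lines. This is a minimal sketch, not a recommended configuration: the helper name billing_alarm_params, the alarm name, the six-hour period and the SNS topic are assumptions for illustration. It relies on the AWS/Billing EstimatedCharges metric, which is published in us-east-1 once billing alerts are enabled on the account.

```python
from typing import Any, Dict


def billing_alarm_params(threshold_usd: float, sns_topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's PutMetricAlarm call.

    The alarm name, period and threshold are illustrative choices.
    """
    if threshold_usd <= 0:
        raise ValueError("threshold_usd must be positive")
    return {
        "AlarmName": f"estimated-charges-over-{threshold_usd:g}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours, in seconds (assumed evaluation window)
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # hypothetical SNS topic for the security team
    }
```

With boto3 installed, the result could be passed to boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params). A fixed threshold is the simplest option; it catches large jumps but not slow growth.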

These anomalies can help with early detection of an incident, but they can also overwhelm your team with false positives, which might make staff reluctant to investigate signs of a real threat. To counter this, establish a process to pass observed anomalies to a skilled security team, which can determine whether they represent an incident.

2. Capture everything

An AWS incident response plan must outline procedures to quickly, automatically, quietly and securely capture everything that might be relevant to a resolution. Capture all related API calls, configuration changes, running application data, disk contents, volatile memory and TCP data.

Automate as much of this process as possible, because IT professionals are more likely to make mistakes when they work manually during a crisis. Move forward quietly so that hackers don't attempt to cover their tracks. Store forensic data outside the control of operational access credentials to make it harder for hackers to alter or delete.

AWS CloudTrail can automatically capture data about API calls made from the AWS Management Console, the Command Line Interface or the software development kits. This log helps forensic investigators identify the steps attackers took after they gained access to a system. It might also be a good idea to set up an alert for unexpected spikes in API calls.
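To show how such a log supports an investigation, here is a minimal sketch that reads already-parsed CloudTrail records (the entries of the "Records" array in a CloudTrail log file) and reconstructs what a suspect access key did. The function names timeline_for_key and calls_per_hour are hypothetical; the field names eventTime, eventSource, eventName, sourceIPAddress and userIdentity.accessKeyId come from the CloudTrail record format.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List


def timeline_for_key(records: Iterable[Dict[str, Any]], access_key_id: str) -> List[str]:
    """Return one line per API call made with the given access key, oldest first."""
    matched = [
        r for r in records
        if r.get("userIdentity", {}).get("accessKeyId") == access_key_id
    ]
    # eventTime is an ISO 8601 UTC string, so a plain string sort is chronological.
    matched.sort(key=lambda r: r["eventTime"])
    return [
        f'{r["eventTime"]} {r["eventSource"]} {r["eventName"]} '
        f'from {r.get("sourceIPAddress", "?")}'
        for r in matched
    ]


def calls_per_hour(records: Iterable[Dict[str, Any]]) -> Counter:
    """Count API calls per UTC hour; a sudden jump may deserve a closer look."""
    # The first 13 characters of eventTime are "YYYY-MM-DDTHH".
    return Counter(r["eventTime"][:13] for r in records)
```

The hourly counts are only a starting point for a spike alert; what counts as "unexpected" depends on the account's normal activity.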

AWS Config records changes to AWS resource configurations, which gives investigators a history to review after an incident. This can help you create a network diagram that displays resource creation, deletion and changes. Additionally, AWS Config Rules can automatically evaluate actual configurations against predefined desired settings and generate alerts when drift occurs. You can use these alerts to trigger Lambda functions to automatically remediate misconfigurations.
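The idea behind drift detection can be sketched in a few lines. The example below is a generic illustration of comparing a desired baseline with an observed configuration; it is not the AWS Config rule format, and the function name find_drift and the setting names in the usage note are made up for the example.

```python
from typing import Any, Dict, Tuple


def find_drift(baseline: Dict[str, Any], actual: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {setting: (expected, observed)} for every setting that differs.

    A setting missing from either side is reported with None on that side.
    """
    drift = {}
    for key in sorted(set(baseline) | set(actual)):
        expected, observed = baseline.get(key), actual.get(key)
        if expected != observed:
            drift[key] = (expected, observed)
    return drift
```

For instance, find_drift({"ssh_open_to_world": False}, {"ssh_open_to_world": True}) reports the single changed setting, which is the kind of finding a remediation function would act on.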

What's more, AWS Identity and Access Management includes an Access Advisor feature that shows which services a given account has accessed and when it last did so, which can be useful to identify compromised credentials associated with an incident.
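As a sketch of how this data might be used, the function below filters an IAM "service last accessed" response down to the services an identity touched during the incident window. It assumes the response shape returned by the IAM GetServiceLastAccessedDetails API (a ServicesLastAccessed list whose entries carry ServiceNamespace and, when the service was used, a LastAuthenticated timestamp). The function name services_used_since is hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple


def services_used_since(details: Dict[str, Any], since: datetime) -> List[Tuple[str, datetime]]:
    """List (service namespace, last access time) pairs at or after `since`, oldest first."""
    used = []
    for service in details.get("ServicesLastAccessed", []):
        last = service.get("LastAuthenticated")  # absent if the service was never used
        if last is not None and last >= since:
            used.append((service["ServiceNamespace"], last))
    return sorted(used, key=lambda item: item[1])
```

Services that appear here but that the identity never normally uses are candidates for closer review in the CloudTrail log.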

You can also use the create-snapshot command to capture a snapshot of each volume attached to a running EC2 instance. AWS stores these snapshots in S3; set permissions so that only a different set of access credentials can delete them. A snapshot essentially copies all the data stored on the virtual hard drive of an EC2 instance at that point in time. If, however, the instance also relies on data in an S3 bucket, it's important to securely duplicate this S3 data as well.
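A minimal sketch of automating the snapshot step, assuming boto3 and an instance whose disks are EBS volumes. The function name snapshot_instance_volumes and the incident-id tag are assumptions for illustration; describe_volumes and create_snapshot are the EC2 client calls used. The client is passed in so that the capture can run under dedicated forensic credentials.

```python
from typing import Any, List


def snapshot_instance_volumes(ec2_client: Any, instance_id: str, incident_id: str) -> List[str]:
    """Snapshot every EBS volume attached to instance_id; return the snapshot IDs."""
    # Note: pagination is ignored here, which is fine for a handful of volumes.
    volumes = ec2_client.describe_volumes(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )["Volumes"]
    snapshot_ids = []
    for volume in volumes:
        response = ec2_client.create_snapshot(
            VolumeId=volume["VolumeId"],
            Description=f"Incident {incident_id}: {instance_id} {volume['VolumeId']}",
            TagSpecifications=[{
                "ResourceType": "snapshot",
                # Hypothetical tag key so evidence can be found and protected later.
                "Tags": [{"Key": "incident-id", "Value": incident_id}],
            }],
        )
        snapshot_ids.append(response["SnapshotId"])
    return snapshot_ids
```

A call might look like snapshot_instance_volumes(boto3.client("ec2"), "i-0123456789abcdef0", "IR-2024-001"), where both identifiers are placeholders.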

In addition, capture the memory contents of an EC2 instance associated with an incident, as they will otherwise be lost when the instance stops or terminates. This is relatively straightforward for Windows instances, as most Windows memory dump tools don't impact running apps.

However, it can be more difficult to capture the memory contents of running Linux instances, as memory capture tools for Linux can sometimes corrupt running applications. Try these tools on the same apps in a test environment first, and use them on live systems only after you find a way to safely capture the memory contents.

A log of TCP communications to and from an EC2 server can also prove invaluable to understand how an attacker operates. IT personnel can run the tcpdump packet analyzer, which is available on most Linux distributions, or WinDump on Windows to capture packet data.
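For an automated capture, the tcpdump invocation can be assembled by a script. The sketch below only builds the argument list; the function name tcpdump_command, the interface name and the output path are assumptions, while -i, -n, -s and -w are standard tcpdump options.

```python
from typing import List, Optional


def tcpdump_command(interface: str, output_path: str, host: Optional[str] = None) -> List[str]:
    """Build a tcpdump argument list that writes raw packets to a capture file."""
    command = [
        "tcpdump",
        "-i", interface,    # network interface to listen on
        "-n",               # do not resolve addresses to names
        "-s", "0",          # use the default (full) snapshot length
        "-w", output_path,  # write raw packets to a file for later analysis
    ]
    if host:
        command += ["host", host]  # optional filter: traffic to or from one address
    return command
```

The list can be handed to subprocess.Popen on the affected instance, typically with root privileges, for example tcpdump_command("eth0", "/forensics/capture.pcap", host="203.0.113.9").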

3. The forensic handoff

The IT team typically initiates the incident data capture process as soon as it identifies a potential threat. But an AWS incident response plan must also include a process to hand off this information to a skilled forensics team, which can analyze the data, identify the root cause of the problem and recommend remediation steps.
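One possible way to make the handoff verifiable, offered as a suggestion rather than a required step, is to ship the captured files with a manifest of SHA-256 hashes so the forensics team can confirm that nothing changed in transit. The function names sha256_of and build_manifest and the directory layout are assumptions for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large captures don't need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir: str) -> List[Dict[str, object]]:
    """Return one entry (relative path, size, hash) per file under evidence_dir."""
    root = Path(evidence_dir)
    return [
        {
            "file": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        }
        for path in sorted(root.rglob("*"))
        if path.is_file()
    ]
```

The manifest can be serialized with json.dumps and stored alongside the evidence, ideally under the same restricted credentials.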

4. Practice

Your team should also practice the execution of an AWS incident response plan, including organization of incident data for the forensics team. This helps the forensics team streamline the way this data is captured and makes it easier for them to quickly analyze the data with their favorite tools and practices.
