

Tips and tools to give AWS network performance a jolt

Don't wait for network performance to lag to do something about it. Proactively monitor AWS workloads to identify troublesome areas that can disrupt the end-user experience.

Poor network performance can introduce all sorts of problems for an enterprise application, including unwanted latency or even inaccessibility. When network performance lags, it can also impair the end-user experience, violate the commitments of an enterprise service-level agreement to its stakeholders and perhaps even impact the company's revenue.

Public clouds, such as AWS, add to these challenges. Applications that reside within a public cloud depend on the provider to manage, maintain and support the network and its access to the internet. Yet, enterprise owners still bear those same obligations to ensure strong network performance for the business, customers and end users. For this reason, it is more important than ever to invest in suitable tools and best practices to monitor AWS network performance.

AWS monitoring tools

To monitor AWS network performance, an enterprise can use various native services from the cloud provider, including:

AWS CloudTrail. This logging tool can watch, record and report on activity across AWS resources and services. CloudTrail records the API calls made through the AWS Management Console, command-line tools and software development kits. Many IT teams use CloudTrail to track resource changes, analyze security and troubleshoot potential problems, which can help an admin see if any network elements have changed and remediate any deviations.

Amazon CloudWatch. AWS' monitoring service can collect metrics, gather log files, set alarms and automatically respond to infrastructure changes. Admins can use CloudWatch to monitor network performance and application health, as well as for capacity planning and troubleshooting. CloudTrail interoperates with CloudWatch and can pass API history data to CloudWatch Logs.

AWS CloudFormation. This service provides an automated, template-based deployment mechanism for AWS resources. An IT team can model its desired infrastructure, and then, CloudFormation securely provisions all required resources and services. AWS also supplies a CloudFormation template for CloudTrail that can automatically create CloudWatch alarms for network-related API calls. While you don't need CloudFormation for network management, it simplifies the creation of metrics and alarms to ease monitoring.

AWS Config. Change is an important and often overlooked factor in network performance. AWS Config can discover AWS resources, record their configurations and map the relationships between them. With AWS Config, an admin can review those relationships to gauge the potential impact of a change before it's made and can capture configurations at a given point in time. Config can send notifications anytime a configuration changes. Furthermore, AWS Config Rules can evaluate changes against the desired configuration. Deploy Config and Config Rules to assess network risk and compliance or troubleshoot in the event of unexpected or undesirable network changes.

Third-party tools. Public cloud providers realize the value of visibility into their infrastructures. AWS provides user-facing APIs, which has spawned the proliferation of third-party tools that support network traffic and performance monitoring. There are many third-party networking tools in AWS Marketplace, such as Dynatrace and Gigamon Visibility Platform.
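To make the CloudTrail-to-CloudWatch handoff described above more concrete, here is a minimal Python sketch that builds a CloudWatch Logs metric filter for network-related API calls. It assumes CloudTrail already delivers events to a log group; the log group name, filter name, metric name and namespace are all hypothetical placeholders, and the list of event names is only an illustrative subset. The final function uses boto3's put_metric_filter call and is not needed to inspect the request.

```python
# Hypothetical names; replace with your own log group and namespace.
LOG_GROUP = "CloudTrail/DefaultLogGroup"

# Illustrative subset of EC2 API calls that change network settings.
NETWORK_EVENTS = [
    "AuthorizeSecurityGroupIngress",
    "RevokeSecurityGroupIngress",
    "CreateRoute",
    "DeleteRoute",
    "CreateNetworkAcl",
    "DeleteNetworkAcl",
]


def build_filter_pattern(event_names):
    """Build a CloudWatch Logs JSON filter pattern matching any event name."""
    clauses = [f"($.eventName = {name})" for name in event_names]
    return "{ " + " || ".join(clauses) + " }"


def build_metric_filter_request(log_group, event_names):
    """Assemble keyword arguments for the CloudWatch Logs put_metric_filter call."""
    return {
        "logGroupName": log_group,
        "filterName": "NetworkChangeEvents",  # hypothetical name
        "filterPattern": build_filter_pattern(event_names),
        "metricTransformations": [
            {
                "metricName": "NetworkChangeEventCount",  # hypothetical name
                "metricNamespace": "Custom/CloudTrailMetrics",  # hypothetical
                "metricValue": "1",  # count one per matching log event
            }
        ],
    }


def apply_metric_filter(request):
    """Send the request to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the builders work without it

    boto3.client("logs").put_metric_filter(**request)


if __name__ == "__main__":
    print(build_metric_filter_request(LOG_GROUP, NETWORK_EVENTS)["filterPattern"])
```

Once the filter publishes a metric, a CloudWatch alarm on that metric can notify the team whenever one of the listed calls occurs.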

Apply best practices to network monitoring

Monitoring sounds easy: You simply install tools, configure them, collect data, handle alarms and report results. But, in actuality, it can be difficult to monitor infrastructure behavior -- even when a public cloud provider has tools for that task.

Before you attempt to monitor your AWS network performance, consider these strategies to improve your results:

Identify metrics. Your monitoring tool of choice can collect a myriad of standard and custom metrics. Don't attempt to collect and monitor all of them, as this usually results in a mass of unused data and unnecessary storage and data analytics costs.

When you choose which metrics to track, ask yourself this: What do you actually want to know about your network, or what questions are you trying to answer?

For example, let's say a business wants to protect its AWS workload against denial-of-service attacks. In that case, it might watch a metric like excess connection requests. Determine which network attributes you want to monitor, then select a combination of tools that help accomplish those goals. From there, you can configure the appropriate metrics and alarms.
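As a rough illustration of what an "excess connection requests" metric could look like, the following Python sketch counts requests in a sliding time window and flags when the count exceeds a limit. The class name, window length and limit are assumptions made for this example; a real deployment would more likely rely on metrics from a load balancer or a dedicated protection service.

```python
from collections import deque


class ConnectionRateMonitor:
    """Flag when more than max_requests arrive within window_seconds."""

    def __init__(self, window_seconds, max_requests):
        self.window_seconds = window_seconds
        self.max_requests = max_requests
        self._events = deque()  # timestamps of recent requests, oldest first

    def record(self, timestamp):
        """Record one connection request; return True if the limit is exceeded."""
        self._events.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Drop requests that have aged out of the window.
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) > self.max_requests


if __name__ == "__main__":
    monitor = ConnectionRateMonitor(window_seconds=10, max_requests=3)
    for t in [0, 1, 2, 3, 20]:
        print(t, monitor.record(t))
```

The point of the sketch is the question it answers: "Are connection requests arriving faster than normal?" That question, not the tool, determines which metric to collect.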

Plan responses. Not only can monitoring tools help your enterprise meet its service-level agreement, compliance and user experience goals, but they can also help when things go wrong. These tools reveal network issues and help determine a clear plan of action to address them. Establish and periodically update a baseline of important metrics that you want to track.

For example, Amazon CloudWatch might monitor network-related metrics, such as NetworkIn, NetworkOut, NetworkPacketsIn and NetworkPacketsOut, for each instance. Relevant metrics and alarms can flag serious network problems, but you should establish a plan to address those issues productively before trouble strikes.
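The per-instance metrics named above can be tied to an alarm. The Python sketch below builds the parameters for a CloudWatch alarm on one of those metrics; the instance ID, threshold, evaluation settings and SNS topic ARN are placeholder assumptions, not recommended values. The publishing step uses boto3's put_metric_alarm call.

```python
EC2_NETWORK_METRICS = {
    "NetworkIn",
    "NetworkOut",
    "NetworkPacketsIn",
    "NetworkPacketsOut",
}


def build_network_alarm(instance_id, metric_name, threshold, sns_topic_arn):
    """Assemble keyword arguments for CloudWatch put_metric_alarm."""
    if metric_name not in EC2_NETWORK_METRICS:
        raise ValueError(f"unsupported metric: {metric_name}")
    return {
        "AlarmName": f"{instance_id}-{metric_name}-high",
        "Namespace": "AWS/EC2",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Sum",
        "Period": 300,  # seconds per datapoint (assumed)
        "EvaluationPeriods": 3,  # consecutive breaches before alarming (assumed)
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # who gets notified: the planned response
    }


def create_alarm(params):
    """Create the alarm in AWS. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the builder works without it

    boto3.client("cloudwatch").put_metric_alarm(**params)


if __name__ == "__main__":
    # Placeholder instance ID and topic ARN, for illustration only.
    alarm = build_network_alarm(
        "i-0123456789abcdef0",
        "NetworkIn",
        5_000_000_000,
        "arn:aws:sns:us-east-1:123456789012:network-oncall",
    )
    print(alarm["AlarmName"])
```

The AlarmActions entry is where the response plan meets the tooling: the alarm is only useful if it reaches someone who knows what to do next.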

Other ways to enhance AWS network performance

Ultimately, it might not be enough to simply monitor your network. Cloud application architects need to understand some of the options available to boost workload performance.

One option, for example, is to add instances to enable the app to perform more work. Along with instances, you will generally need to add more network resources to connect workloads to the internet and other services. Use AWS load balancers to distribute network traffic across a larger number of compute and storage instances.

For hybrid cloud environments, you can dedicate a connection between AWS and the local data center to improve network performance. For example, AWS Direct Connect provides a dedicated telecom circuit from the data center to an AWS facility, which avoids many of the bottlenecks that can plague communication across the public internet.

Enforce security. Only authorized users should be able to provision or change AWS network resources -- especially production workloads. Implement AWS Identity and Access Management controls to limit access to network resources and identify the person responsible for changes.
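One way to express such a restriction is an IAM policy that denies network-changing calls to anyone outside a designated team. The Python sketch below builds that policy document as a dictionary. The tag key "Team", the tag value and the list of actions are assumptions chosen for illustration; an actual policy should be reviewed against your organization's roles and the full set of network actions in use.

```python
import json

# Illustrative subset of EC2 actions that change network resources.
NETWORK_WRITE_ACTIONS = [
    "ec2:CreateVpc",
    "ec2:DeleteVpc",
    "ec2:CreateSubnet",
    "ec2:DeleteSubnet",
    "ec2:CreateRoute",
    "ec2:DeleteRoute",
    "ec2:AuthorizeSecurityGroupIngress",
    "ec2:RevokeSecurityGroupIngress",
]


def build_network_guard_policy(allowed_team):
    """Deny network changes unless the caller's Team tag matches allowed_team.

    The "Team" principal tag is a hypothetical convention for this sketch.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyNetworkChangesOutsideTeam",
                "Effect": "Deny",
                "Action": NETWORK_WRITE_ACTIONS,
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:PrincipalTag/Team": allowed_team}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_network_guard_policy("network-admin"), indent=2))
```

Identifying who made a change is a separate concern: CloudTrail records the identity behind each API call, so the two controls complement each other.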

Define services. Configurations profoundly influence AWS network performance. In many cases, an organization will establish one or more preferred network configurations that provide, by its standards, adequate performance. You could create and tune an established service catalog to spawn only the AWS network configurations proven to deliver strong performance for a particular app or class of apps. While this approach might not guarantee adequate network performance for all workloads and conditions, it does provide a proven starting point and eliminate many of the errors and oversights that could plague a startup network deployment.

Look for change. Unplanned or excessive changes in resource levels or configurations can adversely affect AWS network performance. Tools, such as CloudTrail and Config Rules, can discover and report configuration changes in AWS resources. Prompt reporting enables prompt remediation.

In addition, organizations should consider policies and procedures to address unplanned or unauthorized resource access in production workloads.
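The idea of comparing a resource's current configuration with the desired one can be shown in a few lines of plain Python. This is only an illustration of the comparison itself, not of how AWS Config or Config Rules work internally; the setting names and values below are invented for the example.

```python
MISSING = "<missing>"  # marker for a setting present on only one side


def diff_config(desired, actual):
    """Return {setting: (desired_value, actual_value)} for every mismatch."""
    changes = {}
    for key in sorted(set(desired) | set(actual)):
        want = desired.get(key, MISSING)
        have = actual.get(key, MISSING)
        if want != have:
            changes[key] = (want, have)
    return changes


if __name__ == "__main__":
    # Hypothetical settings for a single network resource.
    desired = {"ingress_port": 443, "cidr": "10.0.0.0/16", "mtu": 9001}
    actual = {"ingress_port": 443, "cidr": "0.0.0.0/0"}
    for setting, (want, have) in diff_config(desired, actual).items():
        print(f"{setting}: expected {want}, found {have}")
```

A report in this shape, listing each setting with its expected and observed value, gives the person remediating the change everything needed to act quickly.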

Customize metrics. While CloudWatch supports custom metrics, you can also create custom metrics scripts in a tool such as Elastic Beanstalk. Or, to optimize existing metrics, change characteristics such as resolution, dimensions and how data is reported. While it might not be necessary to customize metrics, it helps you obtain a more detailed and nuanced view of workload performance.
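To show what resolution and dimensions mean in practice, the Python sketch below builds a single custom metric datapoint in the shape that CloudWatch's put_metric_data call accepts. The metric name, namespace and dimension values are hypothetical; StorageResolution takes 1 for high-resolution (per-second) metrics or 60 for standard resolution.

```python
from datetime import datetime, timezone


def build_custom_metric(name, value, unit, dimensions,
                        high_resolution=False, timestamp=None):
    """Build one MetricData entry for CloudWatch put_metric_data."""
    return {
        "MetricName": name,
        "Dimensions": [
            {"Name": key, "Value": val} for key, val in sorted(dimensions.items())
        ],
        "Timestamp": timestamp or datetime.now(timezone.utc),
        "Value": float(value),
        "Unit": unit,
        "StorageResolution": 1 if high_resolution else 60,
    }


def publish(namespace, datum):
    """Send the datapoint to CloudWatch. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the builder works without it

    boto3.client("cloudwatch").put_metric_data(
        Namespace=namespace, MetricData=[datum]
    )


if __name__ == "__main__":
    # Hypothetical metric: round-trip time from an app tier to its database.
    datum = build_custom_metric(
        "DbRoundTripTime",
        12.5,
        "Milliseconds",
        {"Service": "checkout", "Environment": "prod"},
        high_resolution=True,
    )
    print(datum["StorageResolution"], datum["Dimensions"])
```

Dimensions let you slice the same metric by service or environment, while a higher resolution trades extra cost for finer detail.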

Update alarms. Application workloads rarely remain static over a prolonged period. For example, a user base can grow or shrink over time, or software updates can optimize performance. As these factors change, so does the amount of resources needed to support the workload. Consequently, the current normal conditions might not be normal tomorrow.

As network resources change, periodically review and update performance baselines and alarms. For example, if the alarms for NetworkPacketsIn or NetworkPacketsOut remain unchanged, admins could see more invalid or unnecessary alarms.
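A periodic baseline review can be partly automated. The Python sketch below recomputes a suggested alarm threshold from recent samples and reports whether the current threshold has drifted too far from it. The rule used here (mean plus three standard deviations) and the 20% tolerance are assumptions for illustration; choose a rule that fits how your traffic actually behaves.

```python
import statistics


def recommend_threshold(samples, sigmas=3.0):
    """Suggest a threshold as mean + sigmas * sample standard deviation."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a baseline")
    return statistics.mean(samples) + sigmas * statistics.stdev(samples)


def alarm_needs_update(current_threshold, samples, tolerance=0.2):
    """Return (needs_update, recommended_threshold).

    needs_update is True when the current threshold differs from the
    recommendation by more than the given fraction of the recommendation.
    """
    recommended = recommend_threshold(samples)
    if recommended == 0:
        return current_threshold != 0, recommended
    drift = abs(recommended - current_threshold) / abs(recommended)
    return drift > tolerance, recommended


if __name__ == "__main__":
    # Hypothetical NetworkPacketsIn sums per period after traffic growth.
    recent = [52_000, 49_500, 51_200, 50_800, 53_100, 50_300]
    stale, suggestion = alarm_needs_update(30_000, recent)
    print(stale, round(suggestion))
```

Running a check like this on a schedule helps catch thresholds that were set for last quarter's traffic before they generate a stream of false alarms.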

Evaluate the architecture. How you deploy a workload on AWS impacts network performance. Be sure to review workload architecture and plan changes to optimize performance. For example, you could move instances from an Availability Zone (AZ) with greater latency to one with lower latency, which could noticeably improve network performance. Similarly, new services -- and existing services becoming available in new AZs -- could offer opportunities to optimize a workload in the future.
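Before moving instances, it helps to measure rather than guess. The Python sketch below times TCP connections to an endpoint and picks the zone with the lowest median latency from a set of measurements. The zone labels and latency figures are hypothetical, and TCP connect time is only a rough proxy for the latency an application will actually see.

```python
import socket
import statistics
import time


def tcp_connect_ms(host, port, timeout=2.0):
    """Time a single TCP connection to host:port, in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; close immediately
    return (time.perf_counter() - start) * 1000.0


def median_latency_ms(host, port, attempts=5):
    """Median connect time over several attempts, to smooth out outliers."""
    return statistics.median(tcp_connect_ms(host, port) for _ in range(attempts))


def pick_lowest_latency(latencies_by_zone):
    """Return the zone label with the smallest measured latency."""
    return min(latencies_by_zone, key=latencies_by_zone.get)


if __name__ == "__main__":
    # Hypothetical measurements gathered from test endpoints in each zone.
    measured = {"zone-a": 2.4, "zone-b": 0.9, "zone-c": 1.7}
    print(pick_lowest_latency(measured))
```

In practice, you would run median_latency_ms from the client side of the workload against a test endpoint in each candidate zone and repeat the measurement at different times of day.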
