AWS auto-scaling lets companies expand and contract service as needed

Contributor Chris Moyer describes how Amazon Web Services customers can use AWS auto-scaling to adjust to changes in demand.

What exactly is auto-scaling on Amazon Web Services (AWS)? In what kinds of circumstances would it be useful?

AWS auto-scaling lets you define a "group" of servers that is responsible for a specific task and that scales automatically, adding servers to the group or removing them from it based on a set of parameters you define. Essentially, AWS auto-scaling lets you define when servers should be automatically started or terminated.

AWS auto-scaling is one of AWS's most underused features, and it is an incredibly powerful one. For example, if you have a group of Web servers, you could place them in an auto-scaling group that automatically adds new servers when you get hit with a lot of traffic. The group could also scale back down, terminating instances, after your traffic drops below a certain point.
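
To make the scale-up and scale-down behavior concrete, here is a minimal sketch of the kind of threshold rule described above. It is plain Python, not the AWS API: the function name, the thresholds and the size limits are all hypothetical values chosen for illustration.

```python
def next_group_size(current_size, requests_per_server,
                    scale_out_above=1000, scale_in_below=300,
                    min_size=2, max_size=10):
    """Return the group size after one evaluation of a simple threshold rule.

    All thresholds and limits are made-up illustration values.
    """
    if requests_per_server > scale_out_above:
        proposed = current_size + 1   # heavy traffic: start one more server
    elif requests_per_server < scale_in_below:
        proposed = current_size - 1   # traffic has dropped: terminate one
    else:
        proposed = current_size       # within the normal band: no change
    # Never leave the configured bounds.
    return max(min_size, min(max_size, proposed))
```

Calling next_group_size(3, 1500) returns 4, and next_group_size(3, 100) returns 2. The gap between the two thresholds keeps the group from launching and terminating servers on every small fluctuation.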

In addition, if you know you always want X servers running a certain task, you could have an auto-scaling group that simply states that X servers should be running. Then, if one is terminated for some reason -- for instance, an AWS failure, or you terminate it because it's no longer responding -- the AWS auto-scaling group would automatically launch a replacement without your having to request it.
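
The fixed-size case can be pictured as a loop that compares what is running with what should be running. The sketch below is a toy model under that assumption, not AWS code: the FixedSizeGroup class, its reconcile method and the server IDs are all invented for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FixedSizeGroup:
    """Toy model of a group that should always contain desired_count servers."""
    desired_count: int
    servers: set = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def reconcile(self, unhealthy=()):
        """Drop unhealthy servers, then launch replacements up to desired_count.

        Returns the IDs of the servers launched during this pass.
        """
        self.servers -= set(unhealthy)
        launched = []
        while len(self.servers) < self.desired_count:
            new_id = f"server-{next(self._ids)}"  # made-up ID scheme
            self.servers.add(new_id)
            launched.append(new_id)
        return launched
```

With group = FixedSizeGroup(desired_count=3), the first group.reconcile() launches three servers. A later group.reconcile(unhealthy=["server-2"]) removes the failed server and launches exactly one replacement, which is the behavior described above.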

Auto-scaling groups are most commonly used behind elastic load balancers (ELBs). An ELB can measure how much traffic it is passing through, and even how much latency requests encounter along the way, and those measurements can serve as the signals for scaling. If your Web servers are behind an ELB, consider putting those servers in an auto-scaling group so that you don't have to manage provisioning new servers whenever your site-traffic patterns change.
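
As a rough illustration of how a load balancer's throughput figure could drive group size, the sketch below sizes the group so each server handles about a target number of requests per second. This is one plausible rule, not how AWS necessarily computes it; the function name, the target and the limits are assumptions.

```python
import math


def servers_for_throughput(total_requests_per_sec, target_per_server,
                           min_size=1, max_size=20):
    """Size the group so each server handles roughly target_per_server requests/sec.

    The limits are hypothetical defaults chosen for illustration.
    """
    if target_per_server <= 0:
        raise ValueError("target_per_server must be positive")
    # Round up so the per-server load never exceeds the target.
    needed = math.ceil(total_requests_per_sec / target_per_server)
    # Keep the result inside the configured bounds.
    return max(min_size, min(max_size, needed))
```

For example, servers_for_throughput(950, 200) returns 5, because 950 requests per second divided across servers that should each take about 200 rounds up to five servers.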

Some major AWS cloud computing clients, such as Netflix, have policies requiring that any production server launched in AWS be launched through an AWS auto-scaling group. That practice ensures servers are always automatically managed and that the instances are always up and running. Any server with a problem that can't easily be fixed can simply be terminated, and a replacement will automatically appear for it. That capability also ensures that you're running only the servers you need, at the time you need them, which helps reduce the waste of underutilized servers.
