Scale resources automatically with Azure Virtual Machine Scale Sets

Scalability is an attractive capability of cloud computing that reels in enterprises of all sizes. Learn how Virtual Machine Scale Sets in Azure can help you achieve that benefit.

Dynamic scalability is one of the biggest benefits of cloud computing. IT teams no longer need to pre-provision fixed application architectures to support a specific number of users. Instead, they can add compute resources automatically when demand rises, and the cloud service scales down the unneeded capacity when demand falls.

On the Azure platform, you can automatically scale resources in various ways, depending on the service you use. If you manage your own virtual machines, you can use a feature called Azure Virtual Machine Scale Sets to enable auto scaling. Within a scale set, each virtual machine must share an identical configuration, including its size and OS image. When you enable auto scaling for a scale set, you also configure rules to monitor a specific metric, such as CPU utilization.

Keep in mind that scale sets can be tricky, since they are still a relatively new feature. Here are three tips to get started with Azure Virtual Machine Scale Sets.

Get comfortable with command-line tools and templates

Support for Azure Virtual Machine Scale Sets in the Azure portal is currently limited, as Microsoft continues to work on a full management interface for scale sets. The current process to create a scale set in the portal requires you to launch a resource manager template. This allows you to create a scale set, but the auto-scaling configuration from the template only supports scaling based on one predefined metric: CPU percentage. To scale based on a different metric value, create and manage your scale set and auto-scaling rules from the command line.

In terms of which metrics you can use for auto-scaling activity, Microsoft currently provides two options. The first option is to create scaling rules based on metric data from Azure Monitor, which is essentially the built-in monitoring service for the Azure platform. Auto-scaling rules can take action based on these metric values, depending on the conditions you define. For example, your scaling rules can target metrics for the virtual machines in the scale set, or a completely different metric, such as Service Bus Queue length.
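As a rough illustration of what such a rule contains, the Python sketch below builds one in the shape that autoscale settings generally take in a resource manager template: a metricTrigger paired with a scaleAction. The helper name make_rule, the placeholder resource ID and the threshold values are assumptions for illustration, not taken from a specific template, so check the field names against the current autoscale settings schema before you rely on them.

```python
import json

# Hypothetical placeholder; substitute the resource ID of your own scale set.
VMSS_URI = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.Compute/virtualMachineScaleSets/<scale-set-name>"
)


def make_rule(metric_name, resource_uri, operator, threshold, direction,
              window="PT5M", cooldown="PT5M", change="1"):
    """Return one autoscale rule as a dict (durations are ISO 8601 strings)."""
    return {
        "metricTrigger": {
            "metricName": metric_name,          # metric to watch
            "metricResourceUri": resource_uri,  # resource that emits the metric
            "timeGrain": "PT1M",
            "statistic": "Average",
            "timeWindow": window,               # how far back to look
            "timeAggregation": "Average",
            "operator": operator,
            "threshold": threshold,
        },
        "scaleAction": {
            "direction": direction,             # "Increase" or "Decrease"
            "type": "ChangeCount",
            "value": change,                    # number of VMs to add or remove
            "cooldown": cooldown,
        },
    }


# Illustrative thresholds only.
rules = [
    make_rule("Percentage CPU", VMSS_URI, "GreaterThan", 75, "Increase"),
    make_rule("Percentage CPU", VMSS_URI, "LessThan", 25, "Decrease"),
]

print(json.dumps(rules, indent=2))
```

Because the metric source is just a resource ID, the same structure could in principle point at a resource other than the scale set itself, which is how a rule can react to something like a queue rather than to the virtual machines.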

Another option is to scale based on the values from guest OS metrics. These metrics are recorded from inside the virtual machine through the diagnostics extension. Use this approach to auto scale based on items such as memory usage, which is not currently a metric available in Azure Monitor.

Lastly, Microsoft has built several resource manager templates that are available in the Azure Quick Start GitHub repository. Search the Quick Start page for vmss and you'll find about twenty different templates that will deploy Azure Virtual Machine Scale Sets in a number of different configurations.

Fine-tune your auto-scaling settings

Once you build your own scale sets and auto-scaling rules, there are a couple of pitfalls to watch out for. First, keep in mind that you'll typically want two rules for each scale set: one rule to scale out and an opposite rule to scale in. For example:

  • Scale out: When average CPU utilization is greater than 75% for a period of 5 minutes, add one virtual machine.
  • Scale in: When average CPU utilization is less than 25% for a period of 5 minutes, remove one virtual machine.

The scale set will expand and contract based on the average CPU utilization of its virtual machines. The scale-in rule ensures that you do not spend money on unused compute resources. Also, notice that there is an adequate margin between the average CPU thresholds in the example above. You wouldn't want to scale out at 75% CPU and scale in at 70% CPU, because adding a virtual machine can push the average below the scale-in threshold almost immediately. Maintain a margin to prevent the system from inadvertently flapping back and forth between the scale-out and scale-in rules.
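To see why the margin matters, here is a minimal Python sketch of the flapping effect. It is a toy model, not how the Azure autoscale engine works: it assumes a constant total load that spreads evenly across instances, and the function names decide and simulate are made up for this example.

```python
def decide(avg_cpu, scale_out_at, scale_in_at):
    """Return +1 to add a VM, -1 to remove one, or 0 to do nothing."""
    if avg_cpu > scale_out_at:
        return 1
    if avg_cpu < scale_in_at:
        return -1
    return 0


def simulate(total_load, instances, scale_out_at, scale_in_at,
             steps=6, min_instances=1):
    """Evaluate the two rules repeatedly and record the instance count."""
    history = []
    for _ in range(steps):
        # Toy assumption: load is constant and shared evenly by all VMs.
        avg_cpu = total_load / instances
        change = decide(avg_cpu, scale_out_at, scale_in_at)
        instances = max(min_instances, instances + change)
        history.append(instances)
    return history


# Two VMs running at 80% each (160 "CPU points" of total load).
narrow = simulate(160, 2, scale_out_at=75, scale_in_at=70)
wide = simulate(160, 2, scale_out_at=75, scale_in_at=25)

print(narrow)  # [3, 2, 3, 2, 3, 2]  -> flapping
print(wide)    # [3, 3, 3, 3, 3, 3]  -> settles after one scale out
```

With the narrow margin, the third virtual machine drops the average to about 53%, which is already under the 70% scale-in threshold, so the set bounces between two and three instances. With the wider margin it scales out once and stays put.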

Dig into the docs

Azure Virtual Machine Scale Sets are well documented and the community regularly updates the topics. To get started, review the Azure Virtual Machine Scale Sets FAQ and the best practices to auto scale virtual machines. Also, don't forget about the Azure Quick Start templates, which include existing code you can use to automate your scale set deployments.

Don't scale too fast

The actions within your auto-scaling rules determine what should happen -- scale out or scale in -- when a threshold is breached. These actions have a cooldown setting that specifies how long the system should wait after the last scaling activity before it scales again. The cooldown can range from one minute to one week.

During a scale-out event, it may take a few minutes for your virtual machine to come online and take some of the load off the other resources. If the cooldown setting is too low, you may scale again before your new virtual machine has had a chance to come online and do its job. If you experience unexpected behavior, like scaling out to more instances than you expected, raise the cooldown period in your auto-scaling rule action. Test these settings thoroughly against your workload with the activity patterns that are specific to your environment.
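The interaction between cooldown and startup time can be sketched with a small Python simulation. This is a simplified model under stated assumptions, not the behavior of the Azure autoscale engine: the rule is checked once a minute, load is constant and evenly shared, a new virtual machine needs boot_min minutes before it takes traffic, and the function name provisioned_after_spike is hypothetical.

```python
def provisioned_after_spike(total_load, start_instances, cooldown_min,
                            boot_min=4, threshold=75, minutes=15):
    """Return how many VMs exist after a load spike, given a cooldown."""
    ready_at = []       # minute at which each new VM starts taking load
    last_action = None  # minute of the most recent scale-out action
    for minute in range(minutes):
        # Only VMs that have finished booting share the load.
        serving = start_instances + sum(1 for t in ready_at if t <= minute)
        avg_cpu = total_load / serving
        cooled_down = (last_action is None
                       or minute - last_action >= cooldown_min)
        if avg_cpu > threshold and cooled_down:
            ready_at.append(minute + boot_min)  # new VM is still booting
            last_action = minute
    return start_instances + len(ready_at)


# Two VMs at 80% CPU each; one extra VM would be enough to get under 75%.
print(provisioned_after_spike(160, 2, cooldown_min=1))  # 6
print(provisioned_after_spike(160, 2, cooldown_min=5))  # 3
```

In this model, a one-minute cooldown fires the scale-out rule four times before the first new virtual machine is ready, leaving six instances where three would do. A cooldown longer than the boot time lets the first new instance absorb load before the rule is evaluated again.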
