
Cloud TCO: How to calculate cloud total cost of ownership

Unsure what it'll cost to run your workloads in the cloud? Learn the parameters you need to define in order to get up and running and avoid costly surprises.

It's not hard to estimate what it would cost to purchase a certain amount of cloud-based compute and storage capacity -- after all, vendors publicly list their base prices. But enterprises need a more holistic view of the resources they plan to deploy if they want a true sense of what it will cost to operate in the cloud. Organizations especially want to estimate cloud costs ahead of a cloud migration to help better understand the financial benefits versus continuing to run those workloads on premises.

Editor's note: This article was originally published in December 2019 and updated in November 2021 to add further details on cloud TCO.

What is TCO in cloud computing?

Cloud TCO is a method used to tally the various costs to host, run, integrate, secure and manage workloads in the cloud over their lifetime. These costs include fees for the resources consumed, such as compute, data transfer and storage. They also include integrations with related cloud services, ranging from security and management tools to machine learning and AI. Even personnel costs for cloud engineers can be part of a cloud TCO equation.

Common cloud cost considerations

Running workloads in the cloud involves many types of costs. These include, but are not limited to, the following:

  • application migration (rehost, refactor or redesign);
  • infrastructure-based resources (compute instance size, data storage requirements, and network and SaaS usage);
  • data transit costs between cloud services;
  • data duplication across regions or availability zones; and
  • future usage/workload growth over time.

There are also intangible costs to account for in a TCO model, such as risk management, flexibility and scalability. These can be difficult to quantify but are important in the bigger cost picture. Some of them, such as risk management and specific aspects of security, are partly absorbed by the cloud service provider (CSP). Others, such as flexibility and opportunity cost, reflect how certain costs can restrict or free up the ability to invest in other areas of the business.

How do you calculate cloud TCO vs. on-premises TCO?

To calculate your organization's cloud TCO, start by comparing what it costs to run the same workload on premises and in the cloud. You also must understand the complete functionality required by your application, especially its security requirements and other areas that can add significant costs.

Enterprises need to have a firm handle on their projected cloud TCO, whether it's for a cloud migration or a net-new application. In this article, we review some best practices to determine your cloud TCO as you map out your budgets and to avoid surprises once you're up and running.

Understand the cloud financial model

Utilization and time are the most important variables when comparing on-premises infrastructure to managed services, such as IaaS. Typically, the value assigned to an on-premises resource increases as the terms of the deal extend over a longer time frame and as utilization rates rise. That doesn't apply to cloud resources, which are charged on a consumption basis.

To understand your cloud financial model, the first step is to assign a common resource unit to normalize the data in your TCO comparison. A resource unit could be physical servers, virtual servers or gigabytes of storage. The standard unit applies to both on-premises and cloud assets. For the purposes of this article, let's assume you're looking to move to a cloud provider's infrastructure and not refactor applications for PaaS or serverless configurations.

Next, calculate the average resource unit size for that normalized value, along with the basis used to calculate the average. For example, your normalized value could be an average-sized VM, defined by its RAM and virtual CPU (vCPU) count. To calculate that value, divide the total vCPU count and the total RAM across all VMs by the number of VMs. You should also factor in associated services, such as networking and security, to ensure your calculation is accurate.
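
As a rough illustration of that averaging step, the following Python sketch computes an average-sized VM from a small inventory. The `VM` class, the instance names and the sizes are hypothetical and only stand in for your own inventory data.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str
    vcpus: int
    ram_gb: float


def average_resource_unit(vms):
    """Return (average vCPUs, average RAM in GB) across the given VMs."""
    if not vms:
        raise ValueError("need at least one VM to compute an average")
    count = len(vms)
    total_vcpus = sum(vm.vcpus for vm in vms)
    total_ram_gb = sum(vm.ram_gb for vm in vms)
    return total_vcpus / count, total_ram_gb / count


# Hypothetical inventory used only for illustration.
inventory = [VM("web-1", 2, 8), VM("web-2", 2, 8), VM("db-1", 8, 64)]
avg_vcpus, avg_ram_gb = average_resource_unit(inventory)
print(f"Average unit: {avg_vcpus:.1f} vCPUs, {avg_ram_gb:.1f} GB RAM")
```

The same average unit should then be used as the denominator on both the on-premises and cloud sides of the comparison.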

You also need to model a projected growth rate for your workload. A higher percentage should indicate greater reliance on standardization and automation, which reduces overall costs at scale. Low-growth workloads aren't a great fit for the cloud because organizations won't realize the cost savings like they would for an in-demand application that utilizes the cloud's elasticity and on-demand nature.
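
One simple way to model growth is to convert an annual growth rate into a compounding monthly rate and project resource units month by month. The sketch below assumes steady compound growth, which real workloads rarely follow exactly; the function name and the 20% rate are illustrative assumptions.

```python
def project_units(start_units, annual_growth_rate, months):
    """Project resource units per month, assuming steady compound growth."""
    # Convert the annual rate into an equivalent compounding monthly rate.
    monthly_rate = (1 + annual_growth_rate) ** (1 / 12) - 1
    return [start_units * (1 + monthly_rate) ** m for m in range(months)]


# Hypothetical workload: 100 resource units growing 20% per year for 3 years.
projection = project_units(100, 0.20, 36)
print(f"Month 1: {projection[0]:.0f} units, month 36: {projection[-1]:.0f} units")
```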

Drill down on your TCO model

Once you've determined your workload's needs, decide the starting month for the modeling term. Keep in mind: The first month of any cloud initiative focuses on installations and other startup tasks. Begin your model in the second month to get a more accurate financial picture of your cloud spending.

Decide on the starting capacity, based on your projected requirements for the first true month of usage. Then, determine the optimum capacity utilization percentage and resource units for the end of your modeling term. Set a realistic ceiling of 80% to 90% utilization of maximum capacity.
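
The utilization ceiling translates directly into how much capacity you must provision for a given demand. The following sketch assumes an 85% ceiling, the midpoint of the range above; the function name and demand figures are hypothetical.

```python
import math


def required_capacity(demand_units, utilization_ceiling=0.85):
    """Capacity to provision so that demand stays at or below the ceiling."""
    if not 0 < utilization_ceiling <= 1:
        raise ValueError("utilization_ceiling must be in (0, 1]")
    return math.ceil(demand_units / utilization_ceiling)


# Hypothetical demand at the start and end of the modeling term.
start_capacity = required_capacity(100)
end_capacity = required_capacity(144)
print(f"Provision {start_capacity} units at start, {end_capacity} at end of term")
```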

Factor in any infrastructure overhead and management requirements. For example, include any service management tools and cybersecurity defenses already in place. You want to compare the cost of your on-prem security and management systems to the cloud services you'll need to do the same job. Such overhead reduces the revenue-generating capacity of a fee-based application your company sells to its customers.

IT vendors typically assign pricing and discounts for a maximum of three years for on-premises hardware. Use a monthly unit for analysis, and create your model accordingly -- a longer overall time frame affects the on-prem depreciation component of your cloud TCO analysis.
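
To put on-prem hardware on the same monthly footing as cloud charges, you can spread its purchase price across the modeling term. This sketch assumes straight-line depreciation over 36 months, which is only one of several accounting methods; the purchase price is a made-up figure.

```python
def monthly_depreciation(purchase_price, term_months=36, salvage_value=0.0):
    """Straight-line depreciation per month (an assumed accounting method)."""
    if term_months <= 0:
        raise ValueError("term_months must be positive")
    return (purchase_price - salvage_value) / term_months


# Hypothetical hardware purchase of 180,000 depreciated over three years.
per_month = monthly_depreciation(180_000)
print(f"Monthly depreciation: {per_month:,.2f}")
```

Stretching the term to 48 or 60 months lowers the monthly figure, which is why the choice of time frame changes the outcome of the comparison.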

Finally, determine the usage per month to document the cloud services your organization plans to consume. The goal is to chart your potential usage of services so you can project costs. Typical utilization for a production system is 100%, since these applications run constantly. Conversely, test and development systems may top out at 33% utilization because your teams only use them for eight hours a day.
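
The effect of utilization on a consumption-based bill can be sketched as follows. The hourly rate and instance count are hypothetical, not any vendor's list price, and the 730-hour month is an averaging assumption (8,760 hours per year divided by 12).

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12 months


def monthly_compute_cost(hourly_rate, instance_count, utilization):
    """Estimate monthly compute cost for instances billed per hour of use."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return hourly_rate * HOURS_PER_MONTH * instance_count * utilization


# Hypothetical rate of 0.10 per instance-hour for 10 instances.
production = monthly_compute_cost(0.10, 10, 1.0)     # runs around the clock
dev_test = monthly_compute_cost(0.10, 10, 8 / 24)    # runs eight hours a day
print(f"Production: {production:.2f}, dev/test: {dev_test:.2f}")
```

This only holds if idle development instances are actually stopped; instances left running are billed as if they were at 100% utilization.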

Capture cost components

To capture the granular details that make up your existing on-prem spending and map how that will translate to the cloud, start with your hardware, which typically falls under Capex. On-prem software is also mostly counted as Capex, though it can be billed as Opex, such as with databases. Hardware and software maintenance is another cost component to factor into your TCO.

Don't forget to assess one-time installation fees from your CSP, software vendor or outsourced professional services firm. These could include expenses needed to hire someone to architect your cloud environment or move on-prem assets to a public cloud. If your company works in the public sector or any other highly regulated industry, there could be more upfront costs needed to cover various security requirements that must be met before an application is deployed to the cloud.

You also need to calculate recurring expenses, such as labor for operations and maintenance. If you have a hybrid cloud environment, include your data center power consumption costs in your cloud TCO. You might also have to consider capacity utilization expenditures that aren't captured in your upfront and Capex costs. For example, software licensing fees may scale based on the VMs you deploy as your user base grows over time.
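
Taken together, the one-time, recurring and usage-based components can be rolled into a single total for each environment. The sketch below is a deliberately simplified model; the `CostModel` class and every figure in it are hypothetical placeholders for your own numbers.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    upfront: float            # one-time costs: hardware, installation, migration
    monthly_recurring: float  # fixed monthly costs: labor, maintenance, power
    monthly_per_unit: float   # costs that scale with resource units in use

    def total(self, units_by_month):
        """Total cost over the term, given resource units for each month."""
        monthly_costs = (
            self.monthly_recurring + self.monthly_per_unit * units
            for units in units_by_month
        )
        return self.upfront + sum(monthly_costs)


# Hypothetical figures for a flat 100-unit workload over 36 months.
units = [100] * 36
on_prem = CostModel(upfront=180_000.0, monthly_recurring=4_000.0, monthly_per_unit=10.0)
cloud = CostModel(upfront=20_000.0, monthly_recurring=1_500.0, monthly_per_unit=75.0)
print(f"On-prem: {on_prem.total(units):,.0f}  Cloud: {cloud.total(units):,.0f}")
```

Passing a growing list of units instead of a flat one shows how quickly per-unit charges, such as licensing that scales with VM count, change the comparison.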

Ways to reduce cloud TCO

With diligence and foresight, a business can responsibly plan and manage both short-term and long-term cloud costs. You can do several things to reduce your long-term cloud costs:

  • Employ more automation, from provisioning to monitoring.
  • Increase reliance on standardization.
  • Design applications with clear parameters for flexibility and autoscaling.
  • Eliminate idle or abandoned resources and services.
  • Move steady, predictable workloads from on-demand pricing to committed-use discounts, such as planned or reserved instances.
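
Whether reserved instances pay off depends on how much of the time the instance actually runs. The following sketch estimates the break-even utilization under two assumptions: the reserved rate is billed for every hour of the term whether or not the instance is used, and the on-demand rate is billed only for hours used. Both hourly rates are hypothetical.

```python
def breakeven_utilization(on_demand_hourly, reserved_hourly):
    """Utilization above which the reserved rate is the cheaper option."""
    return reserved_hourly / on_demand_hourly


def cheaper_option(utilization, on_demand_hourly, reserved_hourly):
    """Compare effective cost per elapsed hour at a given utilization."""
    on_demand_cost = on_demand_hourly * utilization  # pay only for hours used
    return "reserved" if reserved_hourly < on_demand_cost else "on-demand"


# Hypothetical rates: 0.10 on demand vs. 0.06 effective reserved rate.
threshold = breakeven_utilization(0.10, 0.06)
print(f"Reserved pricing wins above {threshold:.0%} utilization")
print(cheaper_option(1.0, 0.10, 0.06))     # always-on production workload
print(cheaper_option(8 / 24, 0.10, 0.06))  # eight-hours-a-day dev/test workload
```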

Categorize your potential costs

The various cost components we've discussed so far can be broken down to three categories. For each one, your organization may have one or more cost components to consider:

  1. Product. As a cost component, this includes the on-prem physical servers that host your virtual servers. It also includes the number of racks needed to support those physical servers.
  2. Management. This covers any cost components required to support administration. For example, an AWS user might opt for an outcome-based managed service backed by a service-level agreement, such as the AWS Business Support or Enterprise Support plans.
  3. Industrialization. This includes any cost components that support research, development, automation, documentation or training for the product. It can be hard to quantify the cost benefits behind industrialization, so many TCO comparisons understate the value in this category. It's a prime reason why cloud spending can be full of surprises. For example, a cloud migration or new cloud budget may not accurately reflect the work required to automate key cloud management and operations tasks.

For each cost category, decide if the total costs use the same normalized denominator, as defined by the common capacity variable.
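
Normalizing each category by the common resource unit might look like the sketch below. The category totals and the unit count are invented for illustration, and the function assumes every category shares the same denominator.

```python
def cost_per_unit(category_costs, resource_units):
    """Divide each category's total cost by the common resource unit count."""
    if resource_units <= 0:
        raise ValueError("resource_units must be positive")
    return {name: cost / resource_units for name, cost in category_costs.items()}


# Hypothetical annual totals for the three categories, across 120 average VMs.
annual_costs = {
    "product": 90_000.0,
    "management": 30_000.0,
    "industrialization": 12_000.0,
}
normalized = cost_per_unit(annual_costs, 120)
for name, value in normalized.items():
    print(f"{name}: {value:.2f} per resource unit")
```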

You could choose to use a larger denominator for the management and industrialization categories, especially in a multi-tenant cloud, where those costs are spread across more resource units. A public cloud can save on industrialization costs because some back-end management and training costs are picked up by the CSP.

Define value drivers from the on-premises solution

When defining your value drivers from an on-premises setup, take a hard look at your maximum and stable utilization rates, because the lowest per-resource-unit costs, and therefore the highest value, come at high, steady utilization. Use an average utilization value for the comparison. Utilization here means the overall extent to which servers, your network and other infrastructure are used to deliver services.

The longer an on-premises asset, such as a server or router, is in use, the more value you gain from it. However, longer time frames also factor in increasing operations and maintenance costs, especially as equipment nears end of life.
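
The utilization effect on on-premises value can be made concrete by dividing a fixed monthly cost by the capacity that is actually in use. This is a minimal sketch with hypothetical figures; it ignores the rising maintenance costs of aging equipment.

```python
def cost_per_utilized_unit(monthly_cost, capacity_units, avg_utilization):
    """Effective monthly cost of each resource unit that is actually in use."""
    if not 0 < avg_utilization <= 1:
        raise ValueError("avg_utilization must be in (0, 1]")
    return monthly_cost / (capacity_units * avg_utilization)


# Hypothetical: 5,000 per month for 100 units of on-prem capacity.
at_half = cost_per_utilized_unit(5_000, 100, 0.5)
at_high = cost_per_utilized_unit(5_000, 100, 0.8)
print(f"50% utilized: {at_half:.2f} per unit, 80% utilized: {at_high:.2f} per unit")
```

Because the monthly cost is fixed, every point of additional utilization lowers the effective unit cost, which is the opposite of how consumption-based cloud charges behave.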

Define your value drivers for the cloud

At the heart of any cloud migration, there should be some sort of a value driver. Going to the cloud isn't necessarily cheaper, so cost shouldn't be your only deciding factor. However, you'll be in a better position to make an informed decision about the cloud if you know your cloud TCO.

When you define your value drivers for the cloud, consider utilization factors, such as how many hours a day your VMs will run, storage consumption, availability and security. The cloud's pay-as-you-go model provides some economic benefit because it makes resource management more flexible and frees staff to handle other important tasks.
