cloud SLA (cloud service-level agreement)

What is a cloud SLA (cloud service-level agreement)?

A cloud SLA (cloud service-level agreement) is an agreement between a cloud service provider and a customer that ensures a minimum level of service is maintained. It guarantees levels of reliability, availability and responsiveness for systems and applications; specifies who governs when there is a service interruption; and describes penalties if service levels are not met.

A cloud computing infrastructure can span geographies, networks and systems that are both physical and virtual. While the exact metrics of a cloud SLA can vary by service provider, the areas covered are uniform:

  • Volume and quality of work (including precision and accuracy).
  • Speed.
  • Responsiveness.
  • Efficiency.

The SLA document aims to establish a mutual understanding of the services, prioritized areas, responsibilities, guarantees and warranties provided by the service provider. It clearly outlines metrics and responsibilities among the parties involved in cloud configurations, such as the specific amount of response time to report or address system failures.

The importance of a cloud SLA

Service-level agreements are fundamental as more organizations rely on external providers for their critical systems, applications and data. A cloud SLA sets expectations and responsibilities of both involved parties. It ensures cloud providers meet certain enterprise-level requirements and provides customers with a clearly defined set of deliverables. It also describes financial penalties, such as credits for service time, if the provider fails to live up to the guaranteed terms and conditions.

A cloud SLA's role is essentially the same as any contract -- it is a blueprint that governs the relationship between a customer and provider. These agreed-upon rules create a trusted foundation upon which a customer commits to use a cloud provider's services. They also reflect the provider's commitment to its quality of service (QoS) and underlying infrastructure.

SLA requirements diagram
Service-level agreements (SLAs) are essential when relying on external providers for critical systems, applications and data.

What to look for in a cloud SLA

Service-level agreements come in three different forms: customer-based, service-based and multilevel-based.

  • Customer SLAs are tailored to an individual customer. They define all the specific needs of the customer organization.
  • Service SLAs are more general agreements that offer identical services to multiple customers.
  • Multilevel SLAs are customizable by the customer and enable the customer to integrate multiple conditions into the same system.

The cloud SLA should outline the responsibilities of each party, the acceptable performance parameters, a description of the applications and services covered under the agreement, procedures for monitoring service levels, and a schedule for the remediation of cloud outages. SLAs commonly use technical definitions to quantify the level of service, such as mean time between failures (MTBF) or mean time to repair (MTTR), each of which specifies a target or minimum value for service-level performance.
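As a rough illustration of how MTBF and MTTR relate to an availability target, steady-state availability is commonly estimated as MTBF / (MTBF + MTTR). The sketch below assumes both values are expressed in hours; the function name and the sample figures are hypothetical and do not come from any particular provider's SLA.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Estimate steady-state availability as MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Illustrative figures only: one failure every 999 hours, repaired in 1 hour.
print(f"Estimated availability: {availability(999, 1):.3%}")
```

A shorter repair time or a longer interval between failures both raise the estimate, which is why an SLA might set targets for either metric.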

Common agreements incorporated in an SLA typically include an agreement overview, a description of services, any exclusions or exemptions, the service-level objective (SLO), security standards, in-place disaster recovery processes, service tracking, change processes and the service termination process.

Diagram showing cloud computing shared responsibility.
Depending on the cloud model you choose, you can manage more of the IT assets and services yourself or let the cloud provider manage them for you.

Another key area is service availability, which specifies the maximum amount of time a read request can take, how many retries are allowed and other factors. The defined level of services should be specific and measurable so that they can be benchmarked and, if stipulated by the agreement, trigger rewards or penalties accordingly.

The cloud SLA should also define compensation for users if the specifications aren't met. A cloud service provider usually offers a tiered service credit plan that gives users credits based on the discrepancy between SLA specifications and the actual service levels delivered.
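A tiered credit plan of this kind can be sketched as a simple lookup. The tier boundaries and credit percentages below are hypothetical values chosen for illustration; real providers publish their own schedules, and the function name is an assumption.

```python
# Hypothetical tiers: (minimum measured monthly uptime %, credit as % of monthly fee).
# Ordered from best to worst service level; the first matching tier applies.
CREDIT_TIERS = [
    (99.9, 0),    # SLA met: no credit
    (99.0, 10),
    (95.0, 25),
    (0.0, 100),
]


def service_credit(measured_uptime_pct: float, monthly_fee: float) -> float:
    """Return the credit owed for one billing period under the tiers above."""
    if not 0 <= measured_uptime_pct <= 100:
        raise ValueError("uptime must be a percentage between 0 and 100")
    for floor_pct, credit_pct in CREDIT_TIERS:
        if measured_uptime_pct >= floor_pct:
            return monthly_fee * credit_pct / 100
    raise AssertionError("unreachable: the last tier has a floor of 0")


# Example: 99.5% measured uptime on a 1,000-unit monthly bill.
print(service_credit(99.5, 1000))
```

The larger the gap between the promised and delivered service level, the larger the credit, which mirrors the tiered structure described above.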

Selecting and monitoring cloud SLA metrics

Cloud service-level agreements are detailed to cover areas including the following:

  • Availability.
  • Change management processes.
  • Compliance.
  • Data location, data access and portability.
  • Disaster recovery expectations.
  • Exit strategies.
  • Governance.
  • Performance and uptime statistics.
  • Security specifications such as encryption practices for data protection and data privacy.

Most cloud providers publicly provide details of the service levels that users can expect, and these will likely be the same for all users. However, an enterprise selecting a cloud service might be able to negotiate a more customized deal. For example, the SLA for a cloud storage service might include unique specifications for retention policies, the number of copies to retain and storage locations.

Cloud customers need to specify which metrics to monitor. They should pick a definable and manageable number of metrics that the provider can confidently control, and focus on those most closely aligned with their goals so that monitoring shows whether business needs are being met.

Relevant SLA components chart.
An example of relevant components for a disaster recovery SLA.

Verifying cloud service levels

Customers can monitor service metrics such as uptime, performance and security through a cloud provider's native tooling or a portal. Another option is to use a third-party tool to track the performance baselines of cloud services, including how resources are allocated (e.g., memory in a virtual machine) and security.

It is important that the cloud SLA uses clear language to define terms. Such language governs, for example, inaccessibility of a service and who is responsible -- slow or intermittent loading might be attributed to latency in the public internet, which is outside the cloud provider's control. Providers also typically specify and exempt any downtimes due to scheduled maintenance periods, which are usually, but not always, regularly scheduled and recurring.
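To show why these definitions matter when verifying service levels, the following sketch computes the downtime that counts against an SLA after excluding scheduled maintenance. It assumes outages and maintenance windows are given as (start, end) pairs in minutes from the start of the billing period, and that maintenance windows do not overlap one another. All names and figures are hypothetical.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the intersection of two intervals (0 if they do not intersect)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def countable_downtime(outages, maintenance_windows) -> float:
    """Total outage minutes, minus any portion inside a maintenance window.

    Assumes maintenance windows do not overlap each other.
    """
    total = 0
    for o_start, o_end in outages:
        excluded = sum(
            overlap(o_start, o_end, m_start, m_end)
            for m_start, m_end in maintenance_windows
        )
        total += (o_end - o_start) - excluded
    return total


# Two outages; the second partly falls inside a scheduled maintenance window.
outages = [(0, 30), (100, 160)]
maintenance = [(120, 180)]
print(countable_downtime(outages, maintenance))
```

Whether a given outage is excluded depends entirely on how the agreement defines maintenance and inaccessibility, so the same raw monitoring data can yield different results under different SLAs.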

Negotiating a cloud SLA

Most general cloud services, such as infrastructure as a service (IaaS) options, are straightforward and universal with little variance. There might be more room to negotiate terms in specific custom areas such as data retention criteria, or in pricing and compensation or penalties. Negotiating power typically scales with the size of the customer, but even smaller customers might find room to secure more favorable terms. Be prepared to negotiate for any customized services or applications delivered through the cloud.

When entering any cloud SLA negotiation, it's important to protect the business by clarifying uptimes. A good SLA protects both the customer and supplier from missed expectations. For example, 99.9% uptime ("three nines") is a common stipulation that translates to nearly nine hours (about 8.76) of outage per year; 99.999% ("five nines") means roughly five minutes of annual downtime. Some mission-critical data might require higher levels of availability, such as fractions of a second of annual downtime. Consider multiple regions or zones to help minimize the impact of a major outage.
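The arithmetic behind these "nines" is easy to check before a negotiation. The sketch below converts an uptime percentage into the downtime it permits; it assumes a 365-day year, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, assuming a non-leap year


def allowed_downtime_minutes(uptime_pct: float,
                             period_hours: float = HOURS_PER_YEAR) -> float:
    """Minutes of downtime permitted per period at a given uptime percentage."""
    if not 0 <= uptime_pct <= 100:
        raise ValueError("uptime must be a percentage between 0 and 100")
    return (100 - uptime_pct) / 100 * period_hours * 60


for pct in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Passing a different `period_hours` value, such as 30 * 24 for a 30-day month, gives the allowance for a billing period instead of a full year.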

SLA checklist.

Be aware that some areas of cloud SLA negotiations amount to unnecessary insurance. Few use cases require the highest uptime guarantees, which demand extra engineering work and cost. Those that do might be better served by private on-premises infrastructure.

Pay attention to where data resides with a given cloud provider. Many compliance regulations, such as the Health Insurance Portability and Accountability Act (HIPAA), require data to be kept in specific regions with certain privacy guidelines. The cloud customer owns and is responsible for this data, so be sure these requirements are built into the SLA and are validated by auditing and reporting.

Finally, the cloud SLA should include an exit strategy that outlines the expectations of the provider to ensure a smooth transition.

Scaling a cloud SLA

Most SLAs are negotiated to meet the customer's current needs, but many businesses change dramatically in size over time. A solid cloud service-level agreement outlines intervals where the contract is reviewed and potentially adjusted to meet an organization's changing needs.

Some vendors build in notification workflows that trigger when a cloud service-level agreement is close to being breached, so new negotiations can be initiated based on the changes in scale. This can cover uptime availability levels or usage that exceeds criteria and might warrant an upgrade to a new service tier.
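A near-breach notification of this kind can be approximated by tracking how much of the permitted downtime has already been used in the current period. The following is a minimal sketch, not any vendor's actual workflow; the function names and the 80% warning threshold are assumptions.

```python
def budget_consumed(target_uptime_pct: float,
                    period_minutes: float,
                    downtime_minutes: float) -> float:
    """Fraction of the period's permitted downtime that has been used so far."""
    budget = (100 - target_uptime_pct) / 100 * period_minutes
    if budget == 0:
        # A 100% target permits no downtime at all.
        return float("inf") if downtime_minutes > 0 else 0.0
    return downtime_minutes / budget


def near_breach(target_uptime_pct: float,
                period_minutes: float,
                downtime_minutes: float,
                warn_at: float = 0.8) -> bool:
    """True once the consumed share of the downtime budget reaches warn_at."""
    used = budget_consumed(target_uptime_pct, period_minutes, downtime_minutes)
    return used >= warn_at


# A 30-day month at a 99.9% target permits about 43.2 minutes of downtime.
MONTH_MINUTES = 30 * 24 * 60
print(near_breach(99.9, MONTH_MINUTES, downtime_minutes=40))
print(near_breach(99.9, MONTH_MINUTES, downtime_minutes=10))
```

The same pattern could be applied to usage limits by comparing consumption against the ceiling of the current service tier.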

Cloud SLA examples

Below are links to cloud SLAs from the major public cloud platforms. Many individual cloud services require separate SLAs -- each of these vendors lists dozens of such SLAs.

There's a lot involved in negotiating a cloud SLA. For example, the agreement needs to be precise, be specific about costs, reflect the customer's needs and objectives, and define security details.

This was last updated in January 2024
