Data separation involves the policies and practices related to isolating data and workloads within a cloud. Data separation has three principal goals:
- Compliance. Users know exactly where data and workloads are running, such as specific regions or resource instances.
- Security. Data and workloads are separated in ways that help prevent potential multi-tenancy issues.
- Performance. Data and workloads can be migrated to varied service levels (tiers) based on performance and access frequency needs.
Cloud computing enables organizations to run workloads and manage data anywhere, without significant computing resources residing in their data centers. Public cloud providers use multi-tenant infrastructures, which are efficient and cost-effective, but such multi-tenancy raises concerns about data management and control in the cloud.
Data separation addresses the need to prevent one consumer of cloud services from disrupting or compromising the work or associated data of other consumers of cloud services. In effect, this acknowledges the potential risks of multi-tenant environments, where hypervisor flaws, malicious code running in neighboring applications, excess workload demands and other factors can compromise your workloads.
Thus, data separation in cloud computing requires knowing exactly where your workloads and data are running -- even though the very nature of a public cloud is intended to obscure such granular details -- and then making placement decisions that optimize security and performance.
Why is data separation important?
Consider the noisy neighbor syndrome, where your VM instance runs in a public cloud alongside a handful of VMs from myriad other users, all packed onto the same cloud server. Technically, this causes no issues until one of the neighboring VMs experiences a traffic spike and consumes excess network bandwidth or storage I/O, leaving other VMs -- including yours -- struggling to meet performance requirements.
Data separation can rely on workload performance analytics to identify flagging performance and prompt cloud administrators to scale or migrate a stressed workload to another resource to alleviate contention and improve performance.
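This kind of analytics can be sketched in a few lines. The metric samples, SLO threshold and breach ratio below are illustrative assumptions, not any provider's real monitoring API:

```python
# Sketch of performance analytics that flag a "noisy neighbor" victim.
# Metric values and thresholds are illustrative assumptions.

def needs_migration(latency_samples_ms, slo_ms=50.0, breach_ratio=0.2):
    """Flag a VM for scaling or migration when too many recent I/O
    latency samples breach the service-level objective."""
    breaches = sum(1 for ms in latency_samples_ms if ms > slo_ms)
    return breaches / len(latency_samples_ms) > breach_ratio

healthy = [12.0, 15.5, 11.2, 14.8, 13.1]
contended = [12.0, 95.3, 110.7, 14.8, 88.2]  # neighbor hogging storage I/O

print(needs_migration(healthy))    # within SLO, no action
print(needs_migration(contended))  # flagged for migration
```

A real implementation would feed this decision from a monitoring service rather than a static list, but the contention signal works the same way.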
Beyond malicious code and performance sensitivity, today's legal and geopolitical landscape places serious boundaries on where a cloud customer's workloads and data can reside. A public cloud -- such as AWS, Microsoft Azure or Google Cloud -- possesses a global presence composed of data centers and other points of presence operating in different countries around the world. In the early days of public cloud, the physical location of servers and storage was largely opaque to users; the very idea of utility computing made such physical distinctions irrelevant.
However, as cloud use has expanded, governments, regulatory bodies and other organizations have become sensitive to the physical realities of global computing infrastructures. Some businesses and government agencies face strict restrictions on the cloud regions and tenancy models they are permitted to use. To address these data separation challenges, cloud providers have given users more control over workload and data placement, as well as reporting.
For example, a business located in the United States, but with operations in the European Union, might be obliged to isolate the data collected from EU customers and keep that separated data located on storage resources within an EU cloud region. This helps meet compliance and legislative demands for businesses operating in the cloud.
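A residency rule like that can be enforced at write time by routing each record to a region based on the customer's jurisdiction. The mapping below is a hypothetical compliance policy for illustration (region names mirror AWS naming), not legal guidance:

```python
# Hypothetical residency policy: route customer data to a storage region
# based on jurisdiction. The mapping is an illustrative assumption.

RESIDENCY_POLICY = {
    "EU": "eu-central-1",  # EU customer data stays in an EU region
    "US": "us-east-1",
}

def storage_region(jurisdiction):
    """Return the region allowed to hold data for this jurisdiction."""
    try:
        return RESIDENCY_POLICY[jurisdiction]
    except KeyError:
        raise ValueError(f"no residency rule for {jurisdiction!r}")

print(storage_region("EU"))
```

Centralizing the policy in one lookup keeps placement decisions auditable, which helps when regulators ask where specific customers' data lives.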
Implement a data separation strategy
For organizations, the key to optimizing data separation is to exercise more control over the physical placement of workloads and data. Three essential elements are needed to implement data separation successfully.
1. Needs assessment
Data separation can take many forms, so the first area of concern is a needs assessment to determine the desired goals or results from a data separation strategy. What should data separation look like for the business? This involves strong collaboration between business leaders, technology experts and legal teams.
2. Data insight
Organizations can host bewildering volumes of data across local, colocation and cloud resources. It's impossible to implement a comprehensive and robust data separation strategy unless the business has clear insight into the data available and its importance to the business. What data does the business have, where is it located now and why is it important to the business? Once that insight becomes clear, the business can start prioritizing the resources needed to improve availability, resilience and performance. This might involve relocating or tiering data, as well as implementing a storage and retention policy.
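One way to make that insight concrete is a simple data inventory that records what each dataset is, where it lives and how critical it is, then sorts by priority. The field names and scoring are assumptions for illustration:

```python
# Sketch of a data-insight inventory: what data exists, where it lives,
# and how important it is. Field names and scores are assumptions.

inventory = [
    {"dataset": "customer-orders", "location": "cloud:eu-central-1", "criticality": 3},
    {"dataset": "marketing-assets", "location": "colocation", "criticality": 1},
    {"dataset": "financial-ledger", "location": "on-prem", "criticality": 3},
    {"dataset": "build-artifacts", "location": "cloud:us-east-1", "criticality": 2},
]

# Highest-criticality data is first in line for relocation, tiering or
# stronger retention policies.
prioritized = sorted(inventory, key=lambda d: d["criticality"], reverse=True)
for item in prioritized:
    print(item["dataset"], item["location"])
```

Even a spreadsheet-level inventory like this answers the three questions in one pass: what the data is, where it is now and why it matters.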
3. Data security
In terms of data security and better compliance posture, IT teams need to understand that public clouds operate on the basis of a shared responsibility model. The cloud provider is responsible for securing the physical infrastructure, while the user is responsible for securing the workloads and data. Thus, a cloud user's responsibility starts with configuration.
Overlooked or incorrect configuration settings could leave a workload or data exposed, and potentially leave the business vulnerable to compliance violations. To avoid this issue, get familiar with the many different configuration options and best practices for your cloud provider's services. Proper configurations can be streamlined through cloud services -- such as AWS CloudFormation -- that automatically provision and secure cloud resources across regions and accounts using templates or policies.
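The same checks that a provisioning service applies can be run as a pre-deployment audit. The simplified bucket-configuration schema below is hypothetical, not a real provider's template format:

```python
# Minimal configuration audit: scan simplified storage-bucket settings
# for risky defaults before deployment. The schema is hypothetical.

def audit_bucket(config):
    """Return a list of findings for one bucket configuration."""
    findings = []
    if config.get("public_access", False):
        findings.append("bucket is publicly accessible")
    if not config.get("encryption_at_rest", False):
        findings.append("encryption at rest is disabled")
    return findings

risky = {"name": "reports", "public_access": True, "encryption_at_rest": False}
print(audit_bucket(risky))
```

Running such checks in a deployment pipeline catches the overlooked settings before they ever reach production.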
Another common practice to guard against the risks of multi-tenancy is the extensive use of strong encryption for any data housed within the public cloud. If the data is exposed through misconfiguration or malicious actions, the content remains secure. Ideally, encryption is applied to data both at rest and in transit.
Cloud services for data separation
Additional strategies to implement data separation include the use of various enhanced cloud services intended to bolster the security and control over cloud content. As an example, users can employ a virtual private cloud (VPC), which provisions a logically isolated portion of the public cloud to create a user-defined infrastructure with full control over networking, subnets and other network characteristics. Although VPCs are not physically isolated and are still multi-tenant environments, the level of security is much greater for the organization.
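The user-defined networking a VPC offers can be sketched with Python's standard ipaddress module: a private address block is carved into isolated subnets, one per tier. The CIDR block and tier names are illustrative assumptions:

```python
# Sketch of VPC-style network planning: carve a user-defined address
# block into isolated subnets. CIDR and tier names are assumptions.
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")      # the logically isolated block
subnets = list(vpc.subnets(new_prefix=24))     # 256 possible /24 subnets

tiers = {"web": subnets[0], "app": subnets[1], "db": subnets[2]}
for tier, net in tiers.items():
    print(tier, net)
```

In a real VPC, each subnet would additionally get its own route tables and security rules; the address planning above is the first step of that isolation.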
Tiering strategies can help balance storage costs against data access needs. Amazon S3 Intelligent-Tiering automatically moves data to a suitable cost-effective access tier as access patterns change. For example, data that is accessed less frequently over time can automatically be moved to less expensive (lower performing) storage resources. This type of capability can handle vast quantities of data, including data lakes, data analytics repositories and customer content.
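The underlying rule is simple: demote objects to cheaper tiers as time since last access grows. The tier names and day thresholds below are assumptions in the spirit of S3 Intelligent-Tiering, not the service's actual boundaries:

```python
# Sketch of an access-frequency tiering rule. Tier names and thresholds
# are illustrative assumptions, not S3 Intelligent-Tiering's actual ones.

def choose_tier(days_since_access):
    """Map days since last access to a storage tier."""
    if days_since_access < 30:
        return "frequent"
    if days_since_access < 90:
        return "infrequent"
    return "archive"

for days in (5, 45, 200):
    print(days, choose_tier(days))
```

A managed service evaluates this per object continuously, so cold data drifts to cheap storage with no application changes.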
Cloud providers are also developing and expanding specialized cloud offerings for performance- and security-sensitive users. For example, AWS GovCloud supports numerous U.S. federal standards, including the Criminal Justice Information Services (CJIS) Security Policy, the International Traffic in Arms Regulations (ITAR) and the Export Administration Regulations (EAR). For additional security and oversight, GovCloud is operated by U.S. citizens within the U.S. and is only accessible to U.S. organizations and account holders that are prescreened.
Similarly, AWS supports the architecting and implementation of storage and workloads for HIPAA-sensitive tasks. Dozens of AWS component services support HIPAA compliance and can help cloud architects develop secure operational environments for medical usage and related tasks.
Cloud providers offer a broad array of dedicated, single-tenant servers and cloud options for users. For example, the Amazon EC2 Dedicated Hosts service offers dedicated server hardware to improve workload performance and compliance. This can also be referred to as a bare metal cloud. Similarly, Amazon EC2 Dedicated Instances can be run in a VPC on hardware that is dedicated to a single customer.
The importance of geolocation
The question of geolocation -- knowing and ensuring the physical area of the world where applications and data reside -- is a more important consideration than ever before. Cloud architects must not only assemble a suitable infrastructure for a cloud workload; they must ensure that infrastructure is provisioned in a suitable physical location with the necessary resources and services. Although selecting a specific region is not itself a tenancy decision, location can affect workload performance, compliance and tenancy options.
Consider that a region can be selected to improve workload performance since the physical proximity to the workload's users can significantly reduce network latency. This placement of data can boost the workload's apparent performance and improve user satisfaction.
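Latency-driven region selection reduces to picking the candidate with the lowest measured round-trip time from the user base. The latency figures below are made-up sample measurements:

```python
# Sketch of latency-driven region selection. The round-trip times are
# made-up sample measurements, not real benchmarks.

measured_rtt_ms = {
    "us-east-1": 95.0,
    "eu-west-1": 18.0,       # closest to a European user base
    "ap-southeast-1": 240.0,
}

best_region = min(measured_rtt_ms, key=measured_rtt_ms.get)
print(best_region)
```

In practice these measurements would come from probes near the actual user population, and latency would be weighed against cost, service availability and residency rules.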
Services also vary by region, and not all cloud services may be available in all global regions. This could make it more difficult to deploy or secure workloads or application stacks in some regions. For example, if a needed service or resource isn't currently available in a particular cloud provider's region, the business might be unable to deploy the required environment and run its workload in that region. The solution then would be to rearchitect the environment or to select another suitable region.
Data separation is a team effort
There is no single driver or implementation for data separation, and businesses must approach data separation based on their unique needs. Consequently, it's important to involve a team in any data separation strategy. Business, IT and legal leaders should all have a place in any data separation discussion to ensure that business goals, technical requirements for workload and data performance and legal obligations for corporate governance and compliance are met for every jurisdiction in which the business operates.
Data separation needs and strategies should be reviewed and updated on a regular basis to evaluate changing workload demands as well as evolving legislative and regulatory landscapes. Special circumstances, such as new regulatory legislation or a data breach, should spark immediate reviews.