What is secondary storage?
Secondary storage is persistent storage for noncritical data that doesn't need to be accessed as frequently as data in primary storage or that doesn't have the same performance or availability requirements. Primary storage typically requires costly, high-performance storage systems, whereas secondary storage systems can function effectively on economical, lower-performing devices that are more appropriate for long-term storage.
Data that doesn't require primary storage can be migrated to secondary storage devices to free up space and improve performance on primary storage devices, while lowering overall storage costs. Organizations typically use secondary storage for backup and disaster recovery (DR) data, archival data, or noncritical active data. Secondary storage is also referred to as auxiliary storage.
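One way to picture the migration decision is as an age-based policy: files that have not changed for some period become candidates for the secondary tier. The following Python sketch is illustrative only. The function name `find_migration_candidates`, the `max_idle_days` threshold and the use of modification time as a stand-in for access frequency are assumptions, not part of any particular product.

```python
import os
import time

SECONDS_PER_DAY = 86400


def find_migration_candidates(root, max_idle_days, now=None):
    """Return files under `root` not modified in the last `max_idle_days` days.

    Assumption: modification time approximates how "active" a file is.
    Real tiering tools may use access statistics or other metadata instead.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Files last modified before the cutoff are considered idle.
            if os.path.getmtime(path) < cutoff:
                candidates.append(path)
    return sorted(candidates)
```

A caller might pass the resulting list to whatever copy or archive process moves data to the secondary tier. The threshold itself is a policy choice that depends on the organization's workloads.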
Over the years, the term secondary storage has had different meanings. Initially, it referred to a class of non-volatile media that could store data without always being connected to power. In this sense, secondary storage might include hard disk drives (HDDs), solid-state drives (SSDs), optical discs, USB flash drives, floppy disks or other devices.
This type of secondary storage stood in contrast to primary storage, which referred to a computer's volatile memory devices, such as random access memory (RAM) or data cache. Volatile memory requires a constant source of power. If a volatile device is disconnected from power, the memory is cleared and all data is lost.
Secondary storage has also been distinguished from primary storage based on whether it was external to the computer, as opposed to being an internal component. Any type of memory within the computer was considered primary, and everything connected externally to the computer was considered secondary.
The term secondary storage has also been used to describe external storage devices not connected directly to production servers. In this scenario, the secondary storage devices might be housed in remote locations, but this isn't a requirement.
Although these usages persist today, secondary storage has primarily come to refer to storage that supports data and workloads less critical than those requiring primary storage. In some cases, the term is also used to describe the management of secondary data, either in conjunction with or instead of the hardware on which the data resides.
In general, secondary storage can refer to just about any storage not considered primary storage. Some organizations store archival data in a third tier, separate from the secondary tier and accessed even less frequently. This is called cold storage -- or sometimes tertiary storage -- yet even in this case, secondary storage is still often used as a blanket term to describe all nonprimary storage, including cold storage.
3 types of secondary storage
Data sets stored on secondary storage can include backup data, test and development data, reference data, archived data and older operational data that no longer requires daily access. Organizations might also run analytics against the data to derive additional value, or they might store the data only to meet regulatory requirements.
Secondary storage is commonly used to store backup data that comes from primary storage. The data is copied from the primary storage system to the secondary storage system through the use of replication or other data protection and recovery techniques. To support these operations, the backup system might use specialized software, third-party services, storage system snapshots or other mechanisms.
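As a minimal sketch of the copy step, the Python example below copies one file to a backup location and verifies the copy with a checksum. It assumes both tiers are reachable as ordinary file system paths. The helper names `sha256_of` and `backup_file` are hypothetical, and production backup software typically adds scheduling, versioning, deduplication and snapshot integration that this example omits.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_file(source, backup_dir):
    """Copy `source` into `backup_dir` and verify the copy by checksum.

    Assumption: `backup_dir` stands in for a mounted secondary storage target.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    shutil.copy2(source, target)  # copies file contents and metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"verification failed for {target}")
    return target
```

Verifying the copy matters because a backup is only useful if it can be restored intact.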
Data might also be archived for long-term preservation, whether to meet regulatory compliance or maintain business transaction records. Some organizations might store data for years or even indefinitely. Because this data is accessed infrequently and changes little -- if at all -- it is more cost-effective to store the data on high-capacity secondary storage than on expensive primary storage.
Organizations often turn to secondary storage to support three primary use cases:
- Backup and DR. Backup and DR data might reside on a variety of media and systems, usually determined by its volume and how easily and quickly it can be restored. Both processes rely on restoring secondary data to recreate files and applications lost because of user error, malicious attacks or natural events such as hurricanes, earthquakes or fires. In situations where data is highly sensitive or mission-critical, the data might be backed up to redundant arrays to guard against data loss.
- Archival. Archival data is information that is no longer accessed with any regularity but must be maintained and be accessible if needed, such as data related to internal governance or legal compliance regulations. Because access to archival data is infrequent and doesn't require immediate turnaround, archival storage systems -- e.g., optical storage or removable magnetic media such as tape -- might be offline much of the time.
- Noncritical active data. Many organizations hold data that they access infrequently but still want close at hand, or data that they access regularly but for which performance and availability are not overriding considerations. Examples of noncritical active data include emails, business files, legal documents and business intelligence. This data can be maintained on less expensive, lower-performing storage, but it must be online and readily available, which means tape or optical media wouldn't be appropriate.
Each use case has its own characteristics that help determine the best storage media and storage system to use to support ongoing operations. While secondary storage does not need to meet the same requirements as primary storage, data recovery can be a crucial component in deploying and maintaining a secondary storage system.
Organizations must be able to restore the information and applications they need to continue operations as seamlessly as possible if they run into issues with their primary storage.
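The three use cases can be summarized as a simple decision rule. The Python sketch below is a deliberately simplified illustration. The `DataSet` fields, the `classify_use_case` function and the two yes/no questions it asks are assumptions made for this example. Real classification usually weighs more factors, such as data volume, retention rules and restore-time targets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    name: str
    is_recovery_copy: bool     # kept to restore lost files or applications
    needs_online_access: bool  # must be readable without mounting media


# Media guidance paraphrased from the use cases described above.
MEDIA_NOTES = {
    "backup and DR": "chosen by data volume and required restore speed",
    "archival": "offline media such as tape or optical can be acceptable",
    "noncritical active data": "lower-cost storage that stays online",
}


def classify_use_case(data_set):
    """Map a data set to one of the three secondary storage use cases."""
    if data_set.is_recovery_copy:
        return "backup and DR"
    if data_set.needs_online_access:
        return "noncritical active data"
    return "archival"


def media_note(data_set):
    """Return the media guidance for the data set's use case."""
    return MEDIA_NOTES[classify_use_case(data_set)]
```

For example, a set of old compliance records that no application reads directly would be classified as archival, while a shared folder of business files that users open occasionally would be treated as noncritical active data.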
Benefits of secondary storage
Moving noncritical data from primary storage to secondary storage has two main benefits: It frees capacity on primary storage, and it lowers overall storage costs. Organizations can also realize a third benefit by isolating secondary storage from the main computing network to provide an additional layer of security. They might also host their secondary storage at remote sites as part of their data protection strategies.
Secondary storage provides a lower-cost, higher-capacity storage tier than primary storage, although the stored data might not be as immediately accessible. This tradeoff is worthwhile in some cases, such as when implementing a backup disk appliance or cloud-based backup service.
Backup appliances and cloud services can store vast amounts of data, although accessing the data can require dedicated backup software. Similarly, optical discs and backup tapes must first be mounted in a drive, whether manually or by their respective libraries, before they can be read.
Secondary storage vs. primary storage
Secondary storage data resides on non-volatile storage devices such as SSDs, HDDs, tape drives and optical media. The devices might be hosted on premises, in data centers, at co-location facilities or by service providers on their cloud platforms. The devices are typically used to protect data for DR or for long-term retention, although they can also be used to support active noncritical workloads.
Secondary storage is considered a lower tier than the primary storage tier. When secondary storage is used for backup and archival purposes, the server's operating system (OS) might not have direct control of the storage system. In some cases, secondary storage devices cannot interact directly with an application.
Primary storage, also called active storage, refers to a storage tier containing frequently accessed, mission-critical applications and their data. The data in this tier might be stored on HDDs or SSDs installed inside a server's chassis or in an external storage array. Although the trend has been toward SSDs, HDDs continue to be used extensively in the data center.
Secondary storage is often referred to as Tier 2 storage, with primary storage referred to as Tier 1 storage. Some primary storage might also be classified as Tier 0 storage, particularly when referring to storage systems that use SSDs or if computer memory is being employed as a storage layer.
Examples of secondary storage devices
External HDDs are commonly used as secondary storage devices, often to support consumer storage requirements. An external HDD is a portable device that attaches directly to a computer via a standard USB port. The HDD can serve as secondary computer storage or as a network drive.
Enterprises seldom deploy consumer-oriented portable devices as secondary storage due to concerns about data security and capacity. Instead, they use portable storage devices that integrate enterprise-class data encryption at the device or cartridge level to prevent unauthorized users from gaining access to the data.
Other media used for enterprise secondary storage include disk-based systems and magnetic tape libraries. When performance of a secondary storage system is important, flash SSDs can be paired with HDDs in a hybrid configuration, such as might be found in a hyper-converged infrastructure.
Some all-flash arrays (AFAs) support replication to third-party disk systems for converged data protection in a tiered storage environment. However, the AFAs themselves typically operate at the primary storage tier, with the data replicated to cheaper secondary storage. All-flash storage is rarely used exclusively for secondary data due to its higher cost and lower write endurance.
In a business environment, an older network-attached storage (NAS) box, storage area network (SAN) or tape library can potentially serve as secondary storage. More recently, object storage devices have been used for secondary storage to lessen the demands on primary storage arrays.
Cloud as a secondary storage tier
The rise of the software as a service (SaaS) model makes it possible to use cloud storage for secondary or tertiary storage. This is especially true for backing up or archiving data.
Compared with primary storage in a server, cloud-based archiving has emerged as a cost-effective way to store older data that rarely changes. Organizations are also turning to cloud platforms for other secondary storage needs, such as backups and DR. They might send their data over broadband internet connections to platforms such as Amazon Web Services (AWS) or Microsoft Azure.
When organizations use public cloud platforms, they're accessing data stored on physical servers outside of their own data centers, connecting to the service via the internet. This enables users and applications to access data from any device in any location, although customers might incur charges in addition to the monthly cloud subscription for ingress and egress and for running operations on the data.
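The way these charges add up can be shown with simple arithmetic. The Python sketch below is a rough estimate under stated assumptions: the function name `estimate_monthly_cost`, its parameters and any prices passed to it are hypothetical and do not reflect the pricing of any specific provider, which often varies by region, storage class and volume.

```python
def estimate_monthly_cost(stored_gb, ingress_gb, egress_gb, operations,
                          storage_price_per_gb, ingress_price_per_gb,
                          egress_price_per_gb, price_per_1000_operations):
    """Estimate a monthly cloud storage bill from usage and unit prices.

    All prices are caller-supplied assumptions in one currency.
    """
    storage = stored_gb * storage_price_per_gb
    ingress = ingress_gb * ingress_price_per_gb
    egress = egress_gb * egress_price_per_gb
    ops = operations / 1000 * price_per_1000_operations
    return {
        "storage": storage,
        "ingress": ingress,
        "egress": egress,
        "operations": ops,
        "total": storage + ingress + egress + ops,
    }
```

Comparing the `total` against the `storage` line alone shows how much of a bill comes from moving and touching data rather than from simply keeping it, which is often the deciding factor for restore-heavy workloads.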
Because of these costs, along with concerns about data security and availability, many enterprise customers take a cautious approach to selecting the public cloud as a secondary target, although cloud adoption for secondary storage continues to accelerate. The SaaS model enables a company to scale its cloud-based consumption costs based on varying demands.
Even so, some organizations have set up their own private clouds on premises to provide secondary storage services that can be managed internally. In addition, many organizations are now implementing hybrid clouds, hosting some data locally and archiving less active data in a public cloud repository.