The future of data storage must handle heavy volume
Storage manufacturers are investing in new technologies with an eye on the future, but it's unclear whether they can effectively counter the exponential rise in data volumes.
We're drowning in data, and it continues to pour in from every front. So, where are we going to store it all?
Storage technologies are keeping up, and emerging technologies could extend that window over the next few years. But what happens beyond that? The future of data storage is murky at best. But it's not without hope, and today's advances could carry IT into the next decade and beyond.
The state of the storage market and what's needed
In a 2018 study, IDC predicted that the world would need to store 175 zettabytes (ZB) of data by 2025, which represents an average annual growth rate of 27%. We appear to be on track to either reach or exceed that amount. In fact, Statista puts the total at more than 180 ZB, an estimate that could climb even higher once we understand the full impact of the COVID-19 pandemic.
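The 27% figure can be sanity-checked with a simple compound-growth calculation. The sketch below is illustrative only: it assumes a 2018 baseline of roughly 33 ZB, a number taken from IDC's study rather than from this article, and the function name is hypothetical.

```python
def compound_annual_growth_rate(start, end, years):
    """Return the constant yearly growth rate that turns start into end."""
    return (end / start) ** (1 / years) - 1


# Assumption: about 33 ZB in 2018 (not stated in this article).
# The 175 ZB target for 2025 is the figure cited above.
rate = compound_annual_growth_rate(start=33, end=175, years=2025 - 2018)
print(f"Implied annual growth: {rate:.1%}")
```

Under that assumed baseline, the result lands close to the 27% cited above. A different starting point would shift the rate accordingly.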
Several forces have contributed to the staggering growth of data, including big data initiatives, AI and machine learning, an increase in home workers, the growing adoption of 5G networks and the proliferation of IoT.
Throughout this growth, storage manufacturers have generally been able to keep up, even when faced with pandemic and supply chain issues. Although HDDs continue to outpace SSDs, the gap between the two steadily shrinks. Tape drives and media have reemerged. Storage class memory has made slow but continual inroads into the data center. Each year, manufacturers ship more capacity than the previous year, which is likely to continue.
It's up for debate, however, whether manufacturers will be able to meet future data storage demands. Although there are several promising technologies in the wings, it could still be many years before they're commercially viable. Even so, many in the industry believe that, given how manufacturers have been able to meet demand in the past, they'll continue to produce enough storage to meet future requirements, either by improving current technologies or introducing new ones.
However, challenges lie ahead. The supply chain remains susceptible to increased costs, global shipping problems, and material and labor shortages. Unexpected events could easily disrupt any delivery point in that chain. Natural resources needed to manufacture storage and its supporting systems could run out or become prohibitively expensive, disrupting the market even more.
In addition, new data uses may emerge that generate even greater amounts of data, skewing growth projections and leaving the world with inadequate storage. Some storage devices may reach their physical scaling limits, so new technologies will need to handle future workloads. And the risk of another pandemic remains a real threat.
How well can current technologies work?
Storage providers continue to invest in existing technologies in order to deliver bigger capacities and faster performance, both on premises and in the cloud. These investments could go a long way in meeting current and future data storage needs.
Not surprisingly, the big story here is the cloud. Enterprises continue to migrate much of their data to cloud storage, a movement given a boost by the pandemic. But not all of an enterprise's data ends up with one provider, nor does all of it leave the data center. Many enterprises adopt cloud computing models on premises. Together, these forces result in a growing focus on how to manage multi-cloud and hybrid cloud environments and reduce the investment in capacity.
Managing storage across multiple environments has become more important, given rising concerns about security and compliance, storage costs and the need to derive the most value from data. Storage manufacturers, software companies and cloud service providers offer products to better control storage resources across multiple environments. For example, HPE has partnered with Morpheus Data to help simplify multi-cloud management, Dell Technologies Cloud has incorporated VMware Cloud Foundation into its infrastructure to support hybrid cloud scenarios, and Seagate's Lyve Cloud combines with Cohesity to provide multi-cloud data management and disaster recovery.
The cloud movement has led to the steady adoption of object storage for more complex workloads, such as databases, AI, machine learning and advanced analytics. Object storage is highly scalable and well suited to the growing amounts of data. It has proven so effective, in fact, that it's making its way on premises, where it gets a boost from systems running NVMe SSDs. For example, Dell's Elastic Cloud Storage system uses all-flash storage and relies on NVMe-oF for its back-end network.
Storage protocols, along with storage interfaces, can play a major role in meeting the growing data requirements. They bring greater bandwidth, so they can support more data moving between systems, which could translate to greater storage efficiency. Many storage devices now support PCIe 4.0, which doubles the bandwidth of PCIe 3.0. Some systems have started to incorporate PCIe 5.0, which doubles the bandwidth yet again. The PCIe 6.0 specification was finalized earlier in 2022, doubling the bandwidth once more. At the same time, NVMe and NVMe-oF have been widely adopted, which makes it possible to take full advantage of the steady improvements in PCIe.
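To put the generation-over-generation doubling in concrete terms, here is a rough sketch based on the per-lane raw transfer rates published for each PCIe generation. It assumes a four-lane (x4) link, which is common for NVMe SSDs but is not specified in this article, and it ignores encoding and protocol overhead, so real-world throughput is somewhat lower. The helper name is hypothetical.

```python
# Per-lane raw transfer rates in gigatransfers per second (GT/s).
PCIE_GT_PER_S = {"3.0": 8, "4.0": 16, "5.0": 32, "6.0": 64}


def approx_link_gb_per_s(generation, lanes=4):
    """Rough one-direction bandwidth in GB/s, ignoring encoding and protocol overhead."""
    return PCIE_GT_PER_S[generation] * lanes / 8  # 8 bits per byte


for gen in PCIE_GT_PER_S:
    print(f"PCIe {gen} x4: roughly {approx_link_gb_per_s(gen):.0f} GB/s per direction")
```

Each generation keeps the same lane count but doubles the per-lane rate, which is why a drive's interface ceiling can double without any change to the slot's width.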
The storage industry is making important strides in delivering devices that support greater capacities. Enterprise HDDs now commonly exceed 20 TB, and SSDs exceed 30 TB. Tape continues to play a vital role in handling the increasing amounts of archival data. At the same time, storage is more intelligent and automated, and software-defined storage is more widespread. Data reduction technologies, such as compression and deduplication, continue to improve. Together, these advancements will lead to smarter tiering, more efficient storage management and better overall resource usage.
How well will emerging technologies make it?
The storage industry has generally been able to keep up with the demand for capacity and, barring any more unexpected events, should be able to do so for the next couple of years. Beyond that, however, the future of data storage is not quite as certain, and much will depend on today's emerging technologies.
For now, meeting these demands will require the continued use of SSDs, HDDs, tape drives and perhaps optical discs. These formats still represent the most commercially viable storage technologies, whether they're used for on-premises systems or in hyperscale data centers. However, storage manufacturers must continuously improve and refine these technologies to carry us into the future.
In recent years, there have been many predictions about the demise of the HDD, but the reality is much different. The HDD market continues to thrive, and the technology achieves greater capacities than ever. Western Digital, for example, now offers a 26 TB HDD, and Seagate plans to release a drive that exceeds 30 TB sometime next year.
To achieve these greater densities, storage manufacturers are turning to newer technologies, such as helium-filled HDDs, shingled magnetic recording, microwave-assisted magnetic recording and heat-assisted magnetic recording. They're also actively researching NVMe HDDs, making it possible for HDDs to realize some of the benefits of NVMe, which could lead to greater bandwidth and the ability to transport larger amounts of data. This could also benefit HDDs with dual actuators, another emerging technology.
On the SSD front, manufacturers continue their quest to squeeze more bits per cell in order to increase density. The quad-level cell SSD market is steadily growing, and efforts are underway to produce penta-level cell SSDs. At the same time, manufacturers continue to add more layers to their SSD chips, substantially increasing storage density. Micron, for example, recently shipped the world's first 232-layer NAND chip, and SK Hynix plans to start mass-producing a 238-layer chip in early 2023. A growing number of enterprise SSDs also take advantage of PCIe and NVMe to maximize connectivity.
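The trade-off behind adding bits per cell can be shown with a little arithmetic. This is a simplified sketch, not a model of any vendor's NAND: each extra bit doubles the number of charge states a cell must reliably distinguish, while the relative capacity gain shrinks with every step. The names used are illustrative.

```python
# Bits stored per cell for each common NAND cell type.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4, "PLC": 5}


def charge_states(bits_per_cell):
    """Number of distinct charge levels needed to encode the given bits."""
    return 2 ** bits_per_cell


for name, bits in CELL_TYPES.items():
    print(f"{name}: {bits} bit(s) per cell, {charge_states(bits)} charge states")

# Capacity gain from moving the same number of cells from QLC to PLC.
gain = CELL_TYPES["PLC"] / CELL_TYPES["QLC"] - 1
print(f"PLC over QLC capacity gain: {gain:.0%}")
```

Going from quad-level to penta-level cells adds 25% more capacity per cell but requires twice as many charge states, which helps explain why manufacturers pursue higher layer counts alongside denser cells.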
In addition, there has been a resurgence in tape storage, not only because it can be useful for offloading enormous amounts of archival data, but also because of the protections it provides against ransomware. In 2021, the Linear Tape-Open program released the ninth generation of the LTO specification, paving the way for tapes with up to 18 TB of native storage and up to 45 TB of compressed data. IBM, HPE, Quantum and other companies now offer tape-related storage products that conform to the LTO-9 specification. The LTO program is also actively developing the LTO-10 specification, which is expected to support up to 36 TB of native storage and up to 90 TB of compressed data.
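The capacity figures above imply a 2.5:1 compression ratio and make it easy to estimate cartridge counts. The sketch below uses a hypothetical 1 PB archive (treated as 1,000 TB) purely for illustration, and actual compressed capacity depends on how compressible the data is.

```python
import math

# Capacities in TB, taken from the figures cited above.
LTO_NATIVE_TB = {"LTO-9": 18, "LTO-10 (expected)": 36}
LTO_COMPRESSED_TB = {"LTO-9": 45, "LTO-10 (expected)": 90}


def cartridges_needed(archive_tb, cartridge_tb):
    """Whole cartridges required to hold archive_tb of data."""
    return math.ceil(archive_tb / cartridge_tb)


archive_tb = 1000  # Hypothetical archive size: 1 PB expressed as 1,000 TB.
for gen, native_tb in LTO_NATIVE_TB.items():
    ratio = LTO_COMPRESSED_TB[gen] / native_tb
    native_count = cartridges_needed(archive_tb, native_tb)
    compressed_count = cartridges_needed(archive_tb, LTO_COMPRESSED_TB[gen])
    print(f"{gen}: ratio {ratio}:1, {native_count} cartridges native, "
          f"{compressed_count} if fully compressible")
```

Doubling the native capacity roughly halves the number of cartridges for the same archive, which matters for library slot counts and media handling as archives grow.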
Despite the emergence of all these technologies, it is still unclear how well they'll be able to accommodate the projected onslaught of data in the years to come, even if they continue to be improved at their current pace. For this reason, manufacturers and researchers are also looking for long-term solutions, such as DNA storage, 5D crystal storage or holographic storage. But such technologies -- even if they do prove out -- might not be commercially viable for many years to come. In the meantime, manufacturers and their customers will need to keep pushing the limits of the established and emerging technologies and hope that there will be enough storage when they need it.