- Marc Staimer, Dragon Slayer Consulting
Technologies mean little in a vacuum. Data storage technologies are no different. There is little value to a technology that solves a $1,000 problem at a cost of $1 million.
Whether a new technology is successful still comes down to the problem it solves, its net value and whether it is compellingly better than what is used today. From that perspective, here are data storage trends to consider for 2022.
The battle for data storage performance supremacy will escalate
Our first data storage trend is the continued need for lower latencies to drive down application response times. Lower latencies shorten time to job completion and lead to higher productivity, faster time to actionable insights for databases, faster time to market, greater market share and faster time to revenues. Reduced last-byte latencies are especially important for high-performance computing applications built on the Message Passing Interface (MPI).
Greater IOPS drives storage and database consolidation. Fewer database servers means fewer cores, which reduces infrastructure costs while increasing performance. It also enables more application development at much lower costs and speeds up increasingly popular blockchain applications.
The need for faster analytics, machine learning, deep learning and AI neural networks has organizations constantly asking for better throughput. Better throughput translates into faster time to actionable insights, time to market and time to first revenues.
Data storage performance is supported by the rapid adoption of several much higher performance data storage technologies. For example, the use of NVMe over Fabrics (NVMe-oF) has steadily grown in many data storage systems. It has long come in three primary flavors: InfiniBand, Fibre Channel and RDMA over Converged Ethernet. But these options commonly require new switching and network interface card (NIC) infrastructure. NVMe/TCP is the most recent option, and it does not require the networking infrastructure to be upgraded because it runs on top of TCP/IP in standard Ethernet networks. It's also the easiest to implement. Unsurprisingly, it seems to be the most popular choice among CIOs, CFOs and IT pros in general.
Other technologies organizations are adopting to support data storage performance include the following:
- The swift deployment of PCIe Gen 4, and in some cases Gen 5, in data storage controllers and flash SSDs. PCIe Gen 4 has double the bandwidth of Gen 3, while Gen 5 has double the bandwidth of Gen 4.
- The use of higher bandwidth NICs, ranging from 100 Gbps to 400 Gbps, that have significantly lower latencies. The highest bandwidth NICs demand PCIe Gen 4 or Gen 5.
- The use of data processing units (DPUs) that offload resource-intensive processes such as NVMe-oF, routing and even switching to greatly accelerate data storage performance. DPUs are appearing on high-bandwidth NICs and some data storage systems.
- The implementation of faster NVMe flash SSDs that employ other non-volatile memories in their controllers. These reduce drive latencies and increase IOPS and throughput.
- The use of higher performance storage controllers with the latest CPUs from AMD, Arm and Intel.
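The bandwidth arithmetic behind the PCIe and NIC items above can be sketched in a few lines. This is a rough illustration, not a sizing tool: it uses the published per-lane signaling rates (8, 16 and 32 GT/s for Gen 3, 4 and 5) and 128b/130b encoding, assumes an x16 slot, and ignores packet and protocol overhead, so real-world headroom is smaller. The function names are made up for this example.

```python
# Per-lane raw signaling rates in gigatransfers per second.
GT_PER_SEC = {3: 8.0, 4: 16.0, 5: 32.0}
# Gen 3 through Gen 5 all use 128b/130b line encoding.
ENCODING = 128 / 130


def pcie_gbytes_per_sec(gen: int, lanes: int = 16) -> float:
    """Approximate one-direction slot bandwidth in GB/s, before protocol overhead."""
    return GT_PER_SEC[gen] * ENCODING * lanes / 8


def min_pcie_gen(nic_gbps: float, lanes: int = 16):
    """Lowest PCIe generation whose slot bandwidth covers the NIC line rate."""
    needed_gbytes = nic_gbps / 8
    for gen in sorted(GT_PER_SEC):
        if pcie_gbytes_per_sec(gen, lanes) >= needed_gbytes:
            return gen
    return None  # no listed generation is fast enough at this lane count


if __name__ == "__main__":
    for gen in sorted(GT_PER_SEC):
        print(f"Gen {gen} x16: {pcie_gbytes_per_sec(gen):.1f} GB/s")
    for nic in (100, 200, 400):
        print(f"{nic} Gbps NIC needs at least Gen {min_pcie_gen(nic)} (x16)")
```

Under these assumptions, a 100 Gbps NIC just fits in a Gen 3 x16 slot, while 200 Gbps needs Gen 4 and 400 Gbps needs Gen 5, which matches the point that the highest bandwidth NICs demand the newer PCIe generations.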
Performance is a bit addictive. But when is it enough? The answer is, in reality, never. As new levels are reached, complacency sets in. Applications then take advantage of these newer performance levels and ultimately demand even greater performance. This is a never-ending data storage battle that appears to be speeding up in 2022.
Data storage cyber resilience becomes table stakes in the industry
This data storage trend is a direct result of the evolution of ransomware. Ransomware has become quite good at deleting or corrupting data backups, storage snapshots and replicas. It can also change retention policies, securely erase repositories or simply delete directories. It is an insidious form of theft.
Data storage system vendors have responded by adding immutable storage to make the volume, file system or object bucket holding the backed-up data unchangeable for a policy-defined retention period. Others have added multistep, multifactor authentication for any changes to policies affecting the data. These measures keep ransomware from neutering the backed-up data. Neither is foolproof, nor should either be the only cyberdefense used. And yet, each is another layer of defense that makes it a bit more difficult for cybercriminals to succeed. Cyber resilience of this kind appears likely to become data storage system table stakes in 2022.
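To make the idea of a policy-defined retention lock concrete, here is a minimal in-memory sketch. It does not represent any vendor's implementation: the `ImmutableBucket` class, its methods and the `mfa_verified` flag are hypothetical names invented for this illustration, and a real system would enforce the lock in the storage layer and verify the second factor itself.

```python
from datetime import datetime, timedelta, timezone


class RetentionLockError(Exception):
    """Raised when a write, delete or policy change violates the lock."""


class ImmutableBucket:
    """Toy object bucket: objects cannot be changed until retention expires."""

    def __init__(self, retention_days: int):
        self._retention = timedelta(days=retention_days)
        self._objects = {}  # key -> (data, locked_until)

    def put(self, key, data, now=None):
        now = now or datetime.now(timezone.utc)
        self._check_unlocked(key, now)  # overwriting a locked object is refused
        self._objects[key] = (data, now + self._retention)

    def delete(self, key, now=None):
        now = now or datetime.now(timezone.utc)
        self._check_unlocked(key, now)
        self._objects.pop(key, None)

    def get(self, key):
        return self._objects[key][0]

    def set_retention(self, retention_days: int, mfa_verified: bool = False):
        # Hypothetical stand-in for multistep, multifactor approval.
        # Only affects objects written after the change.
        if not mfa_verified:
            raise RetentionLockError("policy change requires multifactor approval")
        self._retention = timedelta(days=retention_days)

    def _check_unlocked(self, key, now):
        entry = self._objects.get(key)
        if entry is not None and now < entry[1]:
            raise RetentionLockError(f"{key!r} is locked until {entry[1].isoformat()}")
```

In this sketch, a delete or overwrite attempted inside the retention window raises `RetentionLockError`, and so does an attempt to change the retention policy without the extra approval step.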
Unstructured data management will be a data storage game changer
Some readers may be scratching their heads, asking, "What the heck is unstructured data management?" A good way to think about it is the ability to add useful management structure and flexibility to unstructured data. It basically provides database-like management, search, queries and control of unstructured data.
Unstructured data management isn't new. Schema-optional databases -- colloquially known as NoSQL databases even though SQL can be used -- can be document- or object-based and provide management for unstructured data. However, the emergence of autonomous data management, self-governed by AI and machine learning, affects data storage directly.
Data storage vendors have implemented unstructured data management for several years. The problem is that their implementations are data storage centric, not data centric. They tend to be limited to the vendor's own data storage systems coupled with some integration with S3-compatible object storage. They are rarely multivendor and lock a customer into the vendor's data storage systems. If a customer doesn't mind paying more and having no data storage choices going forward, this approach works reasonably well. This is a key reason why unstructured data management hasn't become a major data storage trend to date. But that's about to change.
A new wave of unstructured data management products has emerged over the past few years that abstracts management from the data storage. Some products sit out of the data path, others in the data path and some are a blend of the two. They are data storage system- and vendor-agnostic. They can archive, copy, move and delete original data from primary storage to S3 object storage or even tape, usually without hierarchical stubs. Some products do this with symlinks, others with a global namespace. Some collect, harvest, parse and manage metadata. Others do not.
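The symlink approach mentioned above can be illustrated with a small sketch. It is a simplification under stated assumptions: a local directory stands in for the archive tier (the products described target S3 object storage or tape), age is judged by modification time only, and the function name and parameters are hypothetical.

```python
import shutil
import time
from pathlib import Path


def archive_cold_files(primary: Path, archive: Path, max_age_days: float, now=None):
    """Move files older than max_age_days to the archive, leaving symlinks behind.

    Assumes `archive` lives outside `primary`. Returns the original paths moved.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    moved = []
    # Materialize the listing first so the tree is not modified while walking it.
    for path in sorted(primary.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue  # skip directories and files that were already archived
        if path.stat().st_mtime >= cutoff:
            continue  # still warm; leave it on primary storage
        target = archive / path.relative_to(primary)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target))
        # Applications keep using the original path, now a link to the archive copy.
        path.symlink_to(target.resolve())
        moved.append(path)
    return moved
```

Because the original path survives as a link, applications keep reading the file where they expect it while the data itself sits on cheaper storage. A global namespace achieves a similar effect without touching the file system entries at all.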
The systems are incredibly scalable -- into the hundreds of petabytes and even exabytes. Best of all, they don't need to migrate data from the current file or object storage: They discover and map the data where it exists. Then they archive it, move it to the data storage required for specific data and make copies for different geographic locations. And organizations can justify the costs of these unstructured data management systems.
What makes this new breed of unstructured data management so promising is how it changes data storage purchase decisions. Data storage systems can come from different vendors so organizations can use the best storage for a given application. Based on its value and lifecycle, data moves not to a separate tier in an expensive data storage system, but to a lower cost data storage system. The new breed abstracts not just the unstructured data management, but many data storage system services like lifecycle management, data protection and replication. Replication alone enables data copies across different data storage systems, types, vendors and media.
These next-generation unstructured data management systems from vendors such as 22dot6, Aparavi Software, Datadobi, Data Dynamics, Hammerspace, iXsystems, Quantum, Spectra Logic and StrongBox Data Solutions are likely to be a major data storage trend in 2022.
Cloud-like elastic on-demand pricing for data storage systems
It's no secret that public cloud storage has commanded an increasing share of the data storage market. Elastic on-demand fees only get charged after an organization uses the storage capacity. This model puts the risk on the cloud storage provider because the customer does not have to buy data storage ahead of time to meet unknown future needs. It generally costs more, but not always.
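A short calculation shows why on-demand pricing generally costs more, but not always. All figures below are made-up illustrative numbers, not vendor pricing, and the two functions are hypothetical helpers: one prices capacity bought upfront for a forecast peak, the other prices only the terabytes actually used each month.

```python
def upfront_cost(forecast_peak_tb: int, price_per_tb: int) -> int:
    """Cost of buying capacity ahead of time, sized for the forecast peak."""
    return forecast_peak_tb * price_per_tb


def on_demand_cost(monthly_usage_tb: list, rate_per_tb_month: int) -> int:
    """Cost of paying only for the capacity actually used each month."""
    return sum(monthly_usage_tb) * rate_per_tb_month


if __name__ == "__main__":
    # Hypothetical two-year usage: 100 TB in year one, 200 TB in year two.
    usage = [100] * 12 + [200] * 12
    pay_as_you_go = on_demand_cost(usage, rate_per_tb_month=20)

    accurate_forecast = upfront_cost(200, price_per_tb=300)
    overshot_forecast = upfront_cost(400, price_per_tb=300)

    print(f"On demand:              ${pay_as_you_go:,}")
    print(f"Upfront, 200 TB bought: ${accurate_forecast:,}")
    print(f"Upfront, 400 TB bought: ${overshot_forecast:,}")
```

With these assumed prices, on-demand costs more than an accurately sized upfront purchase ($72,000 versus $60,000), but less than a purchase sized for growth that never arrives ($72,000 versus $120,000). That gap is the forecasting risk the provider takes on.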
The best way for data storage systems vendors to compete more effectively is to provide this same pricing service on premises. Dell, HPE, Infinidat, NetApp and Pure Storage all have these cloud-like elastic on-demand pricing services. Expect several more in 2022.
This is an obvious win-win for the customers and the vendors. Win-win situations almost always turn into trends.
Check back next year to see how accurate these predictions have been.