Today's AI workloads need storage systems that deliver the performance, capacity and availability necessary to ensure reliable operations throughout the application's lifecycle. AI technologies, such as machine learning, deep learning and predictive analytics, require AI storage systems that can deal with the vast amounts of diverse data they generate, along with the fluctuating and process-intensive workloads.
What follows is a look at eight factors to consider when planning storage for your AI workloads.
1. Workload patterns
AI storage requirements vary significantly from one application to the next. Applications generate different quantities of data and have a variety of access requirements and I/O patterns. For example, a deep learning application might need to access and process data more frequently than a basic machine learning one, while at the same time continuously adding data to the existing pool. You must thoroughly understand each workload's storage requirements, both now and in the future, and never assume any two workloads are alike.
But understanding those requirements is no small matter. A typical AI application goes through several stages of operation, and storage requirements can vary from one stage to the next.
For example, during the ingestion stage, vast amounts of heterogeneous data are collected and saved to disk, usually as sequential write operations. But during the transformation stage, when data must be cleansed, organized and transformed, fluctuating amounts of data are read and written, incurring both random and sequential operations.
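One rough way to tell these stages apart is to measure how sequential a workload's I/O is. The sketch below is only an illustration: the function name `sequential_fraction` and the two sample traces are hypothetical, and a real trace would come from whatever I/O tracing tool your platform provides.

```python
def sequential_fraction(requests):
    """Return the share of requests that start where the previous one ended.

    requests: list of (offset, length) tuples in the order they were issued.
    """
    if len(requests) < 2:
        return 1.0
    sequential = 0
    for (prev_offset, prev_length), (offset, _) in zip(requests, requests[1:]):
        if offset == prev_offset + prev_length:
            sequential += 1
    return sequential / (len(requests) - 1)


# Hypothetical traces: ingestion appends block after block,
# transformation jumps around the data set.
ingest = [(0, 4096), (4096, 4096), (8192, 4096), (12288, 4096)]
transform = [(0, 4096), (4096, 4096), (81920, 4096), (16384, 4096), (20480, 4096)]

print(sequential_fraction(ingest))     # 1.0
print(sequential_fraction(transform))  # 0.5
```

A value near 1.0 suggests a streaming pattern that favors throughput, while lower values point to mixed or random access, where latency and IOPS matter more.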
2. AI storage scalability
An AI application needs lots of data. In general, the more relevant data available to the AI application, the more accurate its results tend to be. That data can come from a variety of sources and in a wide range of formats. True, some AI applications require less data than others, but you must still factor in capacity and scalability requirements. Be sure to consider the need to copy, move, aggregate or in other ways manipulate and process the data.
All this storage can represent a significant investment, whether in data center systems or cloud-based services. It can be expensive to handle the storage needed on premises, especially using high-performing flash arrays, yet farming it all out to the cloud isn't always the best alternative.
One way to keep costs down and still meet scalability requirements is to use both flash and hard-disk storage, rather than relying solely on flash. Another option is to implement a hybrid or multi-cloud strategy. The challenge with this approach, however, is that you must carefully regulate the amount of data you're copying or migrating across platforms, and you must keep the distance between those platforms in mind. Otherwise, data duplication or migration costs could undermine the advantages of a cloud strategy.
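The trade-off can be estimated with simple arithmetic. The following is a minimal sketch, not a pricing model: the function names and every price in it are made-up placeholders, so substitute the figures from your own vendors or cloud provider.

```python
def blended_storage_cost(total_tb, hot_fraction, flash_per_tb, hdd_per_tb):
    """Cost per period when only the hot fraction of the data sits on flash."""
    hot_tb = total_tb * hot_fraction
    cold_tb = total_tb - hot_tb
    return hot_tb * flash_per_tb + cold_tb * hdd_per_tb


def transfer_cost(tb_moved, cost_per_tb_moved):
    """Cost of copying or migrating data between platforms."""
    return tb_moved * cost_per_tb_moved


# Hypothetical prices per TB per month; placeholders only.
FLASH_PER_TB = 100.0
HDD_PER_TB = 25.0

all_flash = blended_storage_cost(1000, 1.0, FLASH_PER_TB, HDD_PER_TB)
tiered = blended_storage_cost(1000, 0.25, FLASH_PER_TB, HDD_PER_TB)
migration = transfer_cost(50, 90.0)  # 50 TB moved at a placeholder 90 per TB

print(all_flash)  # 100000.0
print(tiered)     # 43750.0
print(migration)  # 4500.0
```

Running the numbers this way makes it easier to see when recurring migration charges start to erase the savings from tiering or from a cloud platform.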
3. Data durability
For some AI applications, the amount of data isn't the only consideration. You also must look at how long you need to keep that data. Some applications require ongoing analytics that continuously infuse new data into the old, a process that can span years, resulting in enormous stockpiles of information. To ensure the data is going to be around for the duration, you need comprehensive backup and disaster recovery strategies, in addition to heaps of storage capacity.
When evaluating your AI application's workload patterns and scalability requirements, be sure to account for issues such as how long you must hang on to the data, how the data will be accessed going forward, what data can be archived, when it can be archived and, of course, the amount of data that needs to be stored throughout the entire lifecycle.
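Those questions can be captured as an explicit lifecycle rule. The sketch below assumes two hypothetical thresholds (archive after 90 idle days, retain for seven years) and a hypothetical function name, `lifecycle_action`. Your own values will depend on business and regulatory requirements.

```python
from datetime import date


def lifecycle_action(created, last_accessed, today,
                     archive_after_days=90, retain_days=7 * 365):
    """Decide what to do with a data set based on its age and idle time.

    The thresholds are illustrative defaults, not recommendations.
    """
    age_days = (today - created).days
    idle_days = (today - last_accessed).days
    if age_days > retain_days:
        return "expire"   # past the retention period
    if idle_days > archive_after_days:
        return "archive"  # still retained, but no longer actively used
    return "keep"         # active data stays on primary storage


today = date(2024, 1, 1)
print(lifecycle_action(date(2023, 12, 1), date(2023, 12, 20), today))  # keep
print(lifecycle_action(date(2023, 1, 1), date(2023, 6, 1), today))     # archive
print(lifecycle_action(date(2015, 1, 1), date(2023, 12, 1), today))    # expire
```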
4. System performance
An AI solution collects, processes and aggregates data, trains models on it and analyzes the results. To carry out these operations against massive data sets, AI storage must be fast and efficient, able to deliver the necessary throughput and I/O rates, while reducing latency and contention. If the storage system isn't built and optimized to meet these demands, you might be looking at weeks to complete a single iteration of the training phase.
Today's AI products often run on high-performing, GPU-based compute systems. The storage platform must keep up with these systems to make the investment worthwhile. That means, among other things, avoiding I/O bottlenecks and performance issues. A massively parallel storage architecture is one way to achieve these AI storage goals, especially during model training, which puts heavy demands on compute and storage systems alike.
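A back-of-the-envelope calculation shows whether storage can keep the GPUs busy. This is a rough sketch under stated assumptions: the function names are hypothetical, the sample figures are invented, and it ignores caching, compression and preprocessing overhead.

```python
def required_read_gb_per_s(gpus, samples_per_sec_per_gpu, bytes_per_sample):
    """Sustained read throughput (GB/s) needed so that no GPU waits for data."""
    return gpus * samples_per_sec_per_gpu * bytes_per_sample / 1e9


def epoch_hours(dataset_tb, storage_gb_per_s):
    """Hours to read the whole data set once at a given storage throughput."""
    return dataset_tb * 1000 / storage_gb_per_s / 3600


# Invented example: 8 GPUs, each consuming 2,000 samples of 150 KB per second.
print(required_read_gb_per_s(8, 2000, 150_000))  # 2.4

# Time for one pass over a 100 TB data set at 2 GB/s.
print(round(epoch_hours(100, 2.0), 1))           # 13.9
```

If the required figure exceeds what the storage system can sustain, the GPUs sit idle for part of every iteration, and the time for one pass grows accordingly.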
5. Data locality
The location of your data plays a role in efficiently processing massive volumes. The nearer the data is stored to where it's processed, the more efficient the operations. Organizations that process and store data in their own data centers or on a single cloud platform have an advantage. Organizations that use hybrid and multi-cloud strategies could have a tougher time, because moving data between distant sites can undermine some of the advantages that come with cloud strategies. To implement an effective AI solution, you must minimize latencies, and distance can be one of the biggest contributors to latency.
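Distance sets a floor on latency that no tuning can remove. The sketch below estimates that floor, assuming light travels through optical fiber at roughly 200,000 km per second (about two-thirds of its speed in a vacuum). The function names are hypothetical, and real round trips are longer because of routing, switching and protocol overhead.

```python
FIBER_KM_PER_S = 200_000  # approximate speed of light in optical fiber


def min_round_trip_ms(distance_km):
    """Lower bound on round-trip time over fiber, in milliseconds."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


def serial_requests_seconds(request_count, distance_km):
    """Minimum time spent waiting if requests are issued one after another."""
    return request_count * min_round_trip_ms(distance_km) / 1000


print(min_round_trip_ms(1000))                   # 10.0
print(serial_requests_seconds(1_000_000, 1000))  # 10000.0
```

A million small, dependent reads from a site 1,000 km away spend nearly three hours waiting on the network alone, which is why batching, caching or moving the compute closer to the data often pays off.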
6. Storage type
Another consideration when implementing AI workloads is how data will be stored. Object storage is the most common approach. It has the advantage of supporting extensive sets of metadata. Storing metadata along with the actual data makes it possible to describe the data in multiple ways, which, in turn, enables faster and easier searching, an important consideration with AI analytics. In addition, object storage is flexible, space-efficient and highly scalable, and it can deliver high throughput, making it a good match for many AI workloads.
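To illustrate why metadata helps with searching, the sketch below models objects as keys with attached metadata and filters on any combination of fields. It is a plain in-memory stand-in, not the API of any real object store, and the keys, fields and function name are hypothetical.

```python
# Each object key maps to the metadata stored alongside the data.
objects = {
    "images/cat-001.jpg": {"label": "cat", "source": "camera-a", "split": "train"},
    "images/cat-002.jpg": {"label": "cat", "source": "camera-b", "split": "test"},
    "images/dog-001.jpg": {"label": "dog", "source": "camera-a", "split": "train"},
}


def find_objects(store, **criteria):
    """Return the sorted keys whose metadata matches every given field."""
    return sorted(
        key
        for key, metadata in store.items()
        if all(metadata.get(field) == value for field, value in criteria.items())
    )


print(find_objects(objects, label="cat"))
# ['images/cat-001.jpg', 'images/cat-002.jpg']
print(find_objects(objects, source="camera-a", split="train"))
# ['images/cat-001.jpg', 'images/dog-001.jpg']
```

Because the same object can be described by label, source and data split at once, a training job can select exactly the subset it needs without scanning the data itself.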
7. Continuous optimization
Any AI storage system must be continuously optimized to maximize performance and minimize latency. Today's intelligent storage can go a long way in helping to keep systems optimized. An intelligent storage system, which itself uses AI technologies, can uncover patterns in the metric data collected from the storage systems, as well as from other systems in the environment. From these patterns, the intelligent system can automatically resolve issues and optimize storage performance, without human intervention.
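As a very simplified illustration of finding patterns in metric data, the sketch below flags latency samples that rise well above a baseline. It is a basic statistical threshold, not the method any particular intelligent storage product uses, and the function name, window size and sample values are hypothetical.

```python
import statistics


def latency_outliers(samples_ms, baseline_len=5, k=3.0):
    """Return indexes of samples above baseline mean + k standard deviations.

    The first baseline_len samples define normal behavior.
    """
    baseline = samples_ms[:baseline_len]
    threshold = statistics.mean(baseline) + k * statistics.pstdev(baseline)
    return [
        index
        for index, value in enumerate(samples_ms)
        if index >= baseline_len and value > threshold
    ]


# Invented latency readings in milliseconds; the seventh one spikes.
readings = [1.0, 1.2, 0.9, 1.1, 1.0, 1.1, 4.8, 1.0]
print(latency_outliers(readings))  # [6]
```

A production system would use far richer models and many more metrics, but the principle is the same: learn what normal looks like, then act when the readings depart from it.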
Another trend that can benefit AI workloads is software-defined storage (SDS), a systems architecture that decouples storage software from the hardware. By abstracting the physical storage resources, SDS provides greater flexibility, simplifies management and automates operations, while optimizing storage performance, all of which will benefit AI workloads.
8. Cross-platform integration
No system or application exists in a vacuum. Data almost always originates from multiple sources -- sometimes a significant number of them -- and is often stored in numerous locations. Hybrid and multi-cloud strategies only add to the mix, as do technologies such as edge computing, IoT and hyper-converged infrastructures.
No matter how your data moves or where it's stored, you must ensure all systems seamlessly integrate with one another to minimize deployment and maintenance efforts, as well as potential bottlenecks. Wherever possible, use standards-based technologies to help with this process.