If you're looking to deploy AI in your data center, carefully consider what hardware and infrastructure to invest in first.
AI covers a range of techniques, such as machine learning and deep learning, and spans a broad range of business applications, from analytics that predict future performance to recommendation systems and image recognition.
As more large businesses adopt artificial intelligence as part of digital transformation efforts, AI continues to expand and develop as a technology. Understanding why your business requires AI can also help you decide which infrastructure to adopt in order to support it.
Servers with GPUs
Equipping servers with GPUs has become one of the most common infrastructure approaches for AI. You can use the massively parallel architecture of a GPU chip to accelerate the bulk floating-point operations involved in processing AI models.
GPUs also tend to have broad and mature software ecosystems. For example, Nvidia developed the CUDA toolkit so developers can use GPUs for a variety of purposes, including deep learning and analytics. However, although GPUs support certain deep learning tasks, they do not necessarily support all AI workloads.
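To see why GPUs help, consider the workload itself: deep learning compute is dominated by dense matrix multiplication. The pure-Python sketch below (illustrative only, not how you would write production code) makes the floating-point operation count explicit; every output cell is independent, which is exactly the parallelism a GPU's thousands of cores exploit.

```python
# Illustrative sketch: the core workload a GPU accelerates is dense
# matrix multiplication. An (n x k) by (k x m) multiply performs
# n * m * k multiply-add pairs, and every output cell is independent
# of the others -- ideal for massively parallel hardware.

def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), counting FLOPs."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0.0] * m for _ in range(n)]
    flops = 0
    for i in range(n):          # each (i, j) cell is independent:
        for j in range(m):      # a GPU computes many of them at once
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
                flops += 2      # one multiply + one add
            out[i][j] = acc
    return out, flops

a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
result, flops = matmul(a, b)
print(result)  # [[19.0, 22.0], [43.0, 50.0]]
print(flops)   # 16 -- and the count grows cubically with matrix size
```

For realistic model sizes the count runs into billions of operations per forward pass, which is why offloading this arithmetic to parallel hardware matters.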
"There are models within the context of AI and machine learning that don't fall into this neat category of deep learning and have been underexplored because the GPU is very good at neural network type stuff, but it isn't necessarily good at some of these other interesting flavors of algorithms that people are starting to do interesting things with," said Jack Vernon, analyst at IDC.
Before deploying AI in the data center, consider your motives for adopting the technology to decide whether GPUs suit your requirements. Then, seek a specialist's advice on the kind of model that best fits your organization to understand what other infrastructure you need.
Other hardware accelerators
Field-programmable gate arrays (FPGAs) are essentially chips crammed with logic blocks that you can configure and reconfigure as required to perform different functions. ASICs have logic functions built into the silicon during manufacturing. Both offload and accelerate specific workloads in hardware. ASICs make more sense for organizations with a large volume of well-defined workloads, whereas FPGAs offer flexibility for evolving workloads but require more complex programming.
Google offers its TPU -- an ASIC designed specifically for deep learning -- to customers through its Google Cloud Platform. Graphcore designed its IPUs specifically for AI workloads, and Cambricon offers processor chips designed around an instruction set optimized for deep learning. Habana Labs, acquired by Intel, makes separate programmable accelerators for the training and inference parts of deep learning, known as Gaudi and Goya, respectively.
Although GPUs and similar types of hardware accelerators get the most attention when it comes to AI, CPUs remain relevant for many areas of AI and machine learning. For example, Intel has added features to its server CPUs to help accelerate AI workloads. The latest Xeon Scalable family includes Intel Deep Learning Boost, which adds new instructions to accelerate the kind of calculations involved in inferencing. This means that these CPUs can accelerate certain AI workloads with no additional hardware required.
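Much of that CPU-side acceleration targets low-precision integer inference. The sketch below illustrates the basic idea in pure Python: map float values to 8-bit integers with a scale factor, do the dot product on cheap integer arithmetic, then rescale. This is a simplified, symmetric quantization scheme for illustration, not Intel's actual instruction-level implementation.

```python
# Illustrative sketch of the INT8 arithmetic that CPU inference
# extensions accelerate: weights and activations are mapped from
# float to 8-bit integers via a scale factor, and the dot products
# then run as integer multiply-accumulates. Simple symmetric
# quantization shown for clarity; real toolchains are more elaborate.

def quantize(values, num_bits=8):
    """Map floats to signed integers using a symmetric scale."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for int8
    scale = max(abs(v) for v in values) / qmax
    q = [round(v / scale) for v in values]
    return q, scale

def int8_dot(x, w):
    """Dot product in the quantized domain, rescaled back to float."""
    qx, sx = quantize(x)
    qw, sw = quantize(w)
    acc = sum(a * b for a, b in zip(qx, qw))  # integer multiply-accumulate
    return acc * sx * sw                      # rescale to float

x = [0.5, -1.0, 0.25]
w = [0.8, 0.1, -0.4]
approx = int8_dot(x, w)
exact = sum(a * b for a, b in zip(x, w))      # 0.2
# approx lands within quantization error of the exact float result
```

The small accuracy loss from rounding is usually acceptable for inference, which is why INT8 support in CPUs can deliver useful speedups without extra hardware.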
Storage for AI
Organizations should not overlook storage when it comes to infrastructure to support AI. Training a machine learning model requires a huge volume of sample data, and systems must be fed data as fast as they can take it to keep performance up.
"Storage is a really big thing, and the training process itself often involves feedback loops. So, you need to essentially save the model in one stage, run some processing on top of that, to update it, and then sort of continuously recall it," Vernon said. "Most organizations that are building out training and inferencing infrastructure often quickly have a massive requirement for additional storage."
Organizations with existing HPC infrastructure often already have a fast flash storage tier backed by a much larger capacity tier. For most organizations, this means implementing NVMe SSDs with the lowest possible latency, backed by less costly storage to deliver the capacity.
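The point of that fast tier is to keep the training pipeline from stalling on I/O. A common pattern is a bounded prefetch buffer: a background thread reads samples from storage ahead of time so the accelerator never sits idle waiting for data. Here is a toy sketch of the pattern using only the Python standard library; the function names are illustrative stand-ins.

```python
# Toy sketch of the pipeline pattern fast storage supports: a
# background thread prefetches training samples into a bounded queue
# so the compute step never waits on I/O. Names are illustrative.
import threading
import queue

def load_sample(i):
    """Stand-in for reading one training sample from storage."""
    return f"sample-{i}"

def prefetch(n_samples, out_q):
    for i in range(n_samples):
        out_q.put(load_sample(i))   # blocks if the buffer is full
    out_q.put(None)                 # sentinel: no more data

buf = queue.Queue(maxsize=4)        # bounded buffer smooths I/O bursts
threading.Thread(target=prefetch, args=(8, buf), daemon=True).start()

processed = []
while (item := buf.get()) is not None:
    processed.append(item)          # stand-in for one training step

print(len(processed))  # 8
```

If the storage tier cannot refill the buffer as fast as the training step drains it, the accelerator stalls, which is exactly the bottleneck low-latency NVMe tiers are meant to remove.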
Specialized AI systems
Several specialized systems offer higher performance for AI workloads. Nvidia bases its DGX servers around its GPUs, with an architecture optimized to keep those GPUs fed with data. Storage vendors have also partnered with Nvidia to provide validated reference architectures that pair high-performance storage arrays with Nvidia DGX systems. For example, DDN optimized its Accelerated, Any-Scale AI portfolio for all types of access patterns and data layouts used in training AI models, and vendors such as NetApp and Pure Storage offer similar storage architectures.
Intel offers its OpenVINO toolkit as an inferencing engine designed to optimize and run pretrained models. This has a plugin architecture that enables it to execute models on a range of hardware, such as CPUs, GPUs, FPGAs or a mixture of all three, which gives organizations greater deployment flexibility.
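The flexibility comes from the plugin pattern itself: one front end, multiple hardware back ends selected by a device name. The sketch below is a toy illustration of that pattern, not OpenVINO's real API; the class and function names are invented for the example.

```python
# Toy illustration of a device-plugin architecture like the one the
# article describes: the same model request is dispatched to whichever
# hardware back end is selected. NOT OpenVINO's actual API.

class CPUPlugin:
    def run(self, model, x):
        return f"{model} on CPU -> {x * 2}"

class GPUPlugin:
    def run(self, model, x):
        return f"{model} on GPU -> {x * 2}"

PLUGINS = {"CPU": CPUPlugin(), "GPU": GPUPlugin()}

def infer(model, x, device="CPU"):
    """Dispatch one inference request to the requested hardware plugin."""
    return PLUGINS[device].run(model, x)

print(infer("resnet", 21))          # resnet on CPU -> 42
print(infer("resnet", 21, "GPU"))   # resnet on GPU -> 42
```

Because the dispatch happens at run time, the same pretrained model can be deployed to whatever hardware a given data center has available.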
You might also elect to build and train your AI models in the cloud, using on-demand resources you can discontinue once training is finished.