DataDirect Networks has unveiled a massively scalable all-flash reference architecture for AI, based on Nvidia DGX SuperPod supercomputing clusters and new Nvidia Magnum IO software.
High-performance computing (HPC) storage specialist DataDirect Networks (DDN) and rival Dell Technologies each showed off AI-optimized storage integrations involving Nvidia graphics processing units at a supercomputing show this week.
The DDN-Nvidia package combines DDN AI400 all-flash NVMe storage and Nvidia DGX-2 graphics processing unit (GPU) systems in a scale-out architecture. Reference designs are expected to be generally available in 2020 from Nvidia resellers.
The design is based on Nvidia DGX SuperPod computing clusters. The DDN framework scales up to 64 Nvidia GPUs and 10 DDN AI400 arrays in a managed cluster. The AI400 is part of the DDN storage array family of A3I all-flash systems.
Magnum IO will enhance performance of the DDN storage system. Magnum IO, introduced this week by Nvidia, integrates software engineering from DDN, IBM and software-defined storage startups Excelero and WekaIO, along with connectivity from Nvidia-owned Mellanox.
DDN's contributions to Magnum IO involved engineering to simplify the path for IOs to talk to its distributed parallel file system. The work makes it easier to deploy clients into the file system without incurring a performance hit. Based on lab tests with Nvidia, DDN claims its AI400 SuperPod configuration can scale throughput up to 400 GBps.
DDN storage systems have been aimed at research labs and other highly dense compute environments. Fueled by storage acquisitions, DDN wants to capitalize on the commoditization of AI by selling more storage to mainstream data centers.
Supercomputing infrastructure usually takes months to deploy, but the Nvidia compute and DDN storage combination reduces deployment to hours, DDN senior director Kurt Kuckein said.
"We worked to abstract a lot of that difficult work in the A3I product line so that it's packaged optimally for the Nvidia architecture. That allows us to wheel things in and get it up and running very quickly," Kuckein said.
Dell's HPC storage news
Dell Technologies also expanded its HPC product line this week. The Dell EMC Ready Solutions line added two turnkey products: one for BeeGFS, and another for media file storage vendor PixStor. Ready Solutions uses Dell EMC PowerEdge 14th generation servers, Dell networking and Dell EMC storage. The BeeGFS file system is an alternative to the Lustre file system, which is owned by DDN.
Dell introduced reference architectures for running AI platforms with partners DataRobot, Grid Dynamics, Iguazio and Red Hat OpenShift. Bills of material include Dell EMC converged infrastructure, data protection, servers and storage.
Dell added a new accelerator option for its Dell EMC DSS 8440 server: support for Nvidia T4 Tensor Core GPUs. Designed for multi-tenant environments, the DSS 8440 holds up to 16 accelerators.
Dell also introduced support for:
- Nvidia Tesla V100s in PowerEdge servers for fast data transfer between InfiniBand and PowerEdge Express Flash PCIe SSDs;
- Nvidia RTX GPUs for performance improvements for rendering farms; and
- Intel field-programmable gate array cards for faster data inference in AI applications running on PowerEdge R740xd and R940xa servers.