Nvidia took aggressive steps to strengthen its position in the AI supercomputing market this week, outlining plans to deliver multiple systems by year's end that help developers and users build and deploy AI-based applications faster.
The centerpiece of the announcements is a large-memory Nvidia DGX system using the company's new GH200 Grace Hopper Superchip, which is tightly coupled with Nvidia's NVLink Switch System. The new system is purpose-built to create next-generation models for generative AI language applications as well as recommender systems and data analytics workloads.
By coupling NVLink interconnect technology with the NVLink Switch, the system can link up to 256 GH200 superchips and let them act as a single GPU. This gives the system 1 exaflop of AI performance and up to 144 terabytes of shared memory, or about 500 times more memory than the previous-generation DGX A100 system.
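The memory figures above can be sanity-checked with simple arithmetic. A minimal sketch follows; the baseline is an assumption on my part, since the article does not say which DGX A100 configuration the "about 500 times" comparison uses (Nvidia sold both 320 GB and 640 GB variants):

```python
# Back-of-the-envelope check of the DGX GH200 memory figures cited above.
# Assumption: the baseline is a DGX A100 with 320 GB of GPU memory
# (8 x 40 GB A100 GPUs); a 640 GB variant also existed.

TOTAL_SHARED_MEMORY_TB = 144   # DGX GH200 shared memory, per Nvidia
NUM_SUPERCHIPS = 256           # GH200 superchips linked via NVLink Switch
DGX_A100_MEMORY_GB = 320       # assumed baseline configuration

per_superchip_gb = TOTAL_SHARED_MEMORY_TB * 1000 / NUM_SUPERCHIPS
ratio_vs_a100 = TOTAL_SHARED_MEMORY_TB * 1000 / DGX_A100_MEMORY_GB

print(f"Memory per GH200 superchip: ~{per_superchip_gb:.0f} GB")
print(f"Shared memory vs. baseline DGX A100: ~{ratio_vs_a100:.0f}x")
```

Under that assumed 320 GB baseline, the ratio works out to roughly 450x, which is consistent with the "about 500 times" claim.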
In his keynote at the Computex 2023 conference in Taiwan over the weekend, Nvidia CEO Jensen Huang said that generative AI, large language models and recommender systems are "the digital engines of the modern economy." He claimed machines such as the DGX GH200, with its added speed and networking capabilities, will "expand the frontiers of AI."
Jack Gold, an analyst at J. Gold Associates LLC, said, "The issue most supercomputers face is a lot of the compute limitations they have has to do with bandwidth restrictions. Communicating among many chips, or from chips to memory, can slow overall performance. So anything you can do to increase the bandwidth among all those connections can be a huge benefit to how your system is going to perform."
Nvidia said Google, Meta and Microsoft will be the first to have access to GH200, primarily to explore potentially new capabilities for generative AI workloads. AWS will offer the DGX GH200's design via the Nvidia MGX server specification, a modular reference architecture to help other manufacturers and cloud providers build as many as 100 server variations that support a range of AI-based high-performance computing and Omniverse applications.
Software bundled with the system includes Nvidia Base Command, which provides AI workflow management, enterprise-class cluster management, libraries that accelerate compute, storage and network infrastructure, and other system software tuned to run AI-based workloads. Also included is Nvidia AI Enterprise, a software layer supplying developers and users with 100 frameworks, pretrained models and an assortment of development tools designed to simplify the deployment of AI applications into production environments.
"Adding essentially starter models can be a big deal for many companies that don't have the money to build their own [AI] models from scratch," Gold said. "That could take some shops months to train a large AI model, and the associated costs can be monstrous."
Nvidia announced a second AI supercomputer, Helios, made up of four DGX GH200 computers that will be used internally by Nvidia development and research teams. Each of the four GH200 systems will be connected to the Nvidia Quantum-2 InfiniBand network and used to train large AI models.
"This system [Helios] is a research cluster that was built at the Center for Scientific Computing," said Ian Buck, vice president of accelerated computing at Nvidia. "It will be used for climate, weather material science for genetic research and other massively complicated complex scientific problems that are important for the National Computing Center of Taiwan."
Helios is expected to be up and running by the end of this year, Nvidia said.
Nvidia unveiled plans for a third AI supercomputer for Israel-based researchers. The system, called Israel-1, will deliver up to 8 exaflops of AI computing and will be partly operational by year's end, Nvidia said.
In a related announcement, Nvidia and SoftBank Corp. said they are working jointly on a new platform for generative AI as well as for 5G- and 6G-capable applications. The platform will use the MGX reference architecture and leverage Arm Neoverse-based GH200 Superchips.
SoftBank plans to roll out the platform at new AI data centers in Japan, which it said it will build in concert with Nvidia to host AI applications and services on a multi-tenant common server platform.
Supermicro and Quanta Cloud Technology said they hope to be the first to market with systems based on the MGX design this August. Systems from each company will contain the GH200 Grace Hopper chip.
As Editor At Large with TechTarget's News Group, Ed Scannell is responsible for writing and reporting breaking news, news analysis and features focused on technology issues and trends affecting corporate IT professionals.