Just six months after launching its 40-gigabyte A100 GPU, Nvidia has followed up with an 80-gigabyte version that delivers twice as much memory for data-hungry AI workloads and real-time analytics.
The new Nvidia A100, which uses HBM2e memory, can be partitioned into seven separate GPU instances, each with 10 gigabytes of memory, thereby increasing secure hardware isolation and improving efficiency when running smaller, side-by-side workloads, Nvidia said. This partitioning allows a single A100 80-gigabyte instance to deliver 1.25 times higher inference throughput with the RNN-T automatic speech recognition model, according to the company.
While the 80-gigabyte GPU provides researchers and engineers with more speed for traditional scientific applications -- it delivers more than 2 terabytes per second of memory bandwidth -- the new A100 is also aimed at mainstream business applications, such as real-time data analysis.
"Supercomputing has changed in some profound ways, expanding from being just focused on simulations to AI supercomputing with data-driven approaches that now complement traditional simulations," said Paresh Kharya, senior director of product management with Nvidia.
Kharya added that the company's refocused end-to-end approach is necessary if Nvidia is to continue along its current growth path. Nvidia currently has 2.3 million developers across a range of platforms, and supercomputing is becoming increasingly important among them.
"They basically reengineered the whole chip to take it to the next level to better handle AI applications," said Frank Dzubeck, president of Communications Network Architects, Inc. "But what they are also doing is pushing supercomputers down into people's offices, trying to broaden the audience for it."
Nvidia unveiled the new GPU at this week's SC20 supercomputing conference.
Dell, HPE servers to pack 80GB GPUs
A handful of top-tier hardware vendors are betting that more traditional corporate IT shops can use the added processing power. Dell, Hewlett Packard Enterprise, Lenovo, Supermicro and Atos plan to ship systems incorporating either four or eight A100 80-gigabyte GPUs in the first half of 2021.
AMD also added some competitive heat to the GPU market with its 7nm Instinct MI100 GPU, which was also endorsed by Dell, HPE and Supermicro. The new GPU is the first iteration of the company's CDNA GPU architecture. With a throughput rate of 11.5 teraflops, the MI100 is the first GPU to break the 10-teraflop barrier, according to AMD.
Like the Nvidia GPU, the MI100 is built for AI and HPC workloads. It supports AMD's Infinity Fabric, which serves to increase peer-to-peer I/O bandwidth between cards and permit the cards to share unified memory with CPUs.
Given the laser focus corporate IT pros have on AI and machine learning applications, some analysts believe GPU chipmakers won't have too much difficulty working their way past the scientific markets and into the commercial markets.
"The processing capabilities of GPUs makes them ideal for AI-related processes like machine learning," said Charles King, principal analyst with Pund-IT Research. "These latest [GPUs] will have significant roles to play in the next generation supercomputing installations."
Nvidia in particular sees corporate IT as the next step in its evolution, said Patrick Moorhead, president and principal analyst with Moor Insights & Strategy.
"Nvidia is making the play to go directly to the enterprises with their ML training and networking [GPUs]," Moorhead said. "While the company has done this historically with the 'first movers,' it now wants to go after more mainstream enterprises with its converged ML infrastructure."
But all that glitters is not gold. Prestigious contracts with entities like the U.S. Department of Energy may not be hugely profitable for chipmakers, King said.
Also, users can choose alternatives such as field-programmable gate arrays (FPGAs) and other types of performance accelerators for their AI and data analytics workloads.
Nvidia's data science workstation
Nvidia will also deliver a new petascale workgroup server, the DGX Station A100, this quarter to run data science workloads.
The new system, which delivers 2.5 petaflops when running AI-based workloads, has four Nvidia A100 Tensor Core GPUs. Those processors are tightly connected with Nvidia's NVLink, giving users up to 320 gigabytes of GPU memory. The system, working in concert with Nvidia's Multi-Instance GPU technology, can make available 28 separate GPU instances capable of running parallel jobs and supporting multiple users.
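The memory and instance totals quoted for the workstation follow from the per-GPU figures cited earlier: four 80-gigabyte GPUs, each of which can be split into seven instances. A minimal Python sketch of that arithmetic follows. The constant and function names are illustrative assumptions, not part of any Nvidia software.

```python
# Figures as quoted in the article. All names here are hypothetical and
# chosen for illustration; none of them is an Nvidia API.
GPU_MEMORY_GB = 80         # memory per A100 80-gigabyte GPU
INSTANCES_PER_GPU = 7      # maximum separate GPU instances per A100
GPUS_PER_STATION = 4       # A100 GPUs in one DGX Station A100


def station_totals(gpus=GPUS_PER_STATION,
                   memory_gb=GPU_MEMORY_GB,
                   instances_per_gpu=INSTANCES_PER_GPU):
    """Return (total GPU memory in gigabytes, total GPU instances)."""
    return gpus * memory_gb, gpus * instances_per_gpu


total_memory_gb, total_instances = station_totals()
print(f"{total_memory_gb} GB of GPU memory, {total_instances} GPU instances")
```

The same helper can be called with other counts, for example `station_totals(gpus=8)` for an eight-GPU server, to compare configurations.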
DGX Station doesn't require data center-class power or cooling, according to Nvidia. The system has the same remote management software as the Nvidia DGX A100, allowing users to carry out management tasks over remote connections whether workers are in the labs or their home offices, Nvidia said.
Like the A100 40-gigabyte GPU, the DGX Station is suited for a range of non-scientific markets, including education, financial services, government and healthcare.