Exascale computing, now at $2.8B, described as critical to U.S.

The race to build exascale supercomputers is a project of big governments -- China, Japan, the European Union and the U.S. Initial systems are planned in the 2020 to 2021 time frame.

The U.S. is spending $1.1 billion on two exascale computing systems, and Cray Inc.'s technology is on both of them. The U.S. is spending even more on exascale R&D. These systems are so expensive that only a government can afford them. They are considered critical to national defense and U.S. competitiveness.

That's what makes Hewlett Packard Enterprise's planned $1.3 billion acquisition of Cray of particular importance. In the scheme of big tech acquisitions, the purchase price is small potatoes. But it's happening in a sector of importance to U.S. strategic objectives.

The U.S. mainly relies on four U.S. supercomputer makers -- Cray, Hewlett Packard Enterprise (HPE), IBM and Dell Technologies -- to deliver high-performance computing (HPC). The U.S. government operates six of the 10 most powerful supercomputers in the world today -- three are by IBM and three by Cray, according to the Top500, a list of the most powerful known supercomputers in the world. The list is updated twice a year and includes the global leader, Summit, a 200 petaflop IBM system at Oak Ridge National Laboratory.

Exascale is a thousandfold increase in computing power from petascale, which was first reached in 2008. An exascale system can perform 1 quintillion -- that's a one followed by 18 zeros -- calculations per second. The cost of these systems keeps them out of reach for commercial buyers, and that keeps the HPC market small. But the size of the market belies its importance.
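To put those magnitudes side by side, here is a minimal Python sketch of the arithmetic. It uses only the figures cited in this article (Summit at 200 petaflops, exascale at 1 quintillion calculations per second); the variable names are illustrative, not taken from any vendor or Top500 tooling.

```python
# Calculations per second at each scale (standard SI prefixes).
PETAFLOP = 10**15
EXAFLOP = 10**18

# Summit's performance as cited in the article: 200 petaflops.
summit = 200 * PETAFLOP

# The step from petascale to exascale.
exa_over_peta = EXAFLOP // PETAFLOP

# How many Summit-class systems it would take to equal one exaflop.
exa_over_summit = EXAFLOP / summit

print(f"1 exaflop = {EXAFLOP:,} calculations per second")
print(f"exascale / petascale = {exa_over_peta}")
print(f"1 exaflop / Summit = {exa_over_summit}")
```

By this rough measure, a 1 exaflop system would deliver about five times the performance of Summit.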

Cray is building the Frontier exascale computing system. This $600 million system will weigh about 1 million pounds. It will take up about a half-acre of data center space at Oak Ridge National Laboratory. It needs around 30 megawatts of power to operate.

A second exascale computing system, Aurora, will be at Argonne National Laboratory in Lemont, Ill. Cray's technology will be on this system, as well. When the U.S. boots up the systems in 2021, a new era in high-performance computing will begin.

R&D is done virtually

Products such as new drugs, engines and airplanes are developed on HPC systems virtually. Hurricane forecasters, drugmakers or engineers designing an aircraft carrier will be able to develop and test in real time or near-real time, said Jeff Nichols, associate lab director at the Oak Ridge National Laboratory in Oak Ridge, Tenn.

"It'll give you a faster turnaround time, but may also increase the amount of physics that you can put into the model," Nichols said. This will produce models that can look at problems, such as an approaching hurricane, in fine granularity, he said.

The increases in computing power bring new capabilities to researchers, said Steve Scott, CTO of Cray, based in Seattle. "Every single area of science has problems that they cannot solve without another order of magnitude of computing," he said.

The Frontier and Aurora systems are using Cray's architecture, Shasta, and interconnects. Frontier will use AMD chips. In the Aurora system, Cray is a subcontractor to Intel.

Globally, only $5.4 billion was spent on supercomputers in 2018 and less than $14 billion on HPC systems overall, according to Hyperion Research. That's probably less than 1% of the multitrillion-dollar global IT spend, but supercomputing gets attention at the highest level of government.

The global race to exascale

The U.S. is in a global race with China, Japan and Europe to build the most advanced systems. America's two exascale computing systems are due in 2021. China and Japan have set 2020 for their systems, and Europe has one planned, as well.

"National security requires the best computing available, and the loss of leadership in HPC will severely compromise our national security," the National Security Agency and Department of Energy wrote in a joint report, which was released at the end of 2016, just before then-President-elect Donald Trump took office. The report appeared after China produced the world's most powerful supercomputer using its own homegrown chip technology.

The report urged a surge of investment: "HPC resources are required for the development of a variety of military, scientific, and industrial capabilities. Loss of a U.S. leading position would threaten our ability to compete internationally in all of these fields."

"The Chinese government makes sure that their firms attain scale so they can gain global leadership," said Robert Atkinson, president of the Information Technology and Innovation Foundation, a nonpartisan research group in Washington, D.C.

Atkinson said he believes the HPE-Cray merger will be good for the U.S. industry and competitiveness, "especially since the U.S. government is relatively miserly on its support for HPC R&D compared to China."

Jack Dongarra, one of the academics behind the Top500 list and director of the Innovative Computing Laboratory at the University of Tennessee, said in an email the market "can't sustain a number of companies, and a merger like this makes sense."

The U.S. has budgeted $1.7 billion on the Exascale Computing Project for R&D of underlying exascale technology. It is spending $500 million on Aurora. The $600 million budget for Frontier includes about $100 million for additional nonrecurring engineering.

Combined budget of $2.8 billion

That brings the combined exascale spending to $2.8 billion. But the U.S. has a third exascale computing system in the works, El Capitan, that may arrive after 2021.
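The combined figure is a simple sum of the three budget lines reported above. A short Python sketch of the tally, in millions of dollars, follows. The dictionary keys are illustrative labels; El Capitan is left out because no budget figure is given for it here, and the roughly $100 million in nonrecurring engineering is already inside Frontier's $600 million.

```python
# Budget lines as reported in the article, in millions of U.S. dollars.
budget_millions = {
    "Exascale Computing Project (R&D)": 1700,
    "Aurora": 500,
    "Frontier": 600,  # includes about $100M of nonrecurring engineering
}

# Spending on the two systems alone.
systems_total = budget_millions["Aurora"] + budget_millions["Frontier"]

# Systems plus underlying R&D.
combined_total = sum(budget_millions.values())

print(f"Two systems: ${systems_total / 1000:.1f} billion")
print(f"Combined:    ${combined_total / 1000:.1f} billion")
```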

This type of spending "is now what's necessary to achieve this level of performance," said Bob Sorensen, vice president of research and technology at Hyperion.

"Buying a leadership-class HPC is like hosting the Olympics nowadays," Sorensen said. "Only the highest-end countries are going to be able to play the game."

The Cray Shasta architecture was designed for multiple processor types and can run converged workloads, such as modeling, simulation, analytics and AI. Support for different types of workloads is necessary to get enough people behind big-budget HPC projects, Sorensen said. But that support also makes these machines harder and more expensive to build.

When exascale computing is achieved, attention will shift to zettascale. But how zettascale -- a 1,000 exaflop system -- is reached is not yet known.

Is zettascale even possible?

Complementary metal-oxide-semiconductor (CMOS) technology that allowed for regular gains in computing power is nearing its end, according to Scott.

For decades, the U.S. saw a thousandfold increase in computing power roughly every decade, he said.
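As a back-of-the-envelope illustration of what that pace implies, the sketch below converts a thousandfold gain per decade into an annual growth rate. It assumes smooth, compounding growth across the 10 years, which is a simplification and not something Scott stated.

```python
import math

# A thousandfold increase spread evenly over 10 years (assumed smooth compounding).
gain_per_decade = 1000
years = 10

# Annual multiplier g such that g ** years == gain_per_decade.
annual_growth = gain_per_decade ** (1 / years)

# Years needed to double performance at that rate.
doubling_time_years = math.log(2) / math.log(annual_growth)

print(f"Annual growth factor: {annual_growth:.3f}")
print(f"Doubling time: {doubling_time_years:.2f} years")
```

Under that assumption, a thousandfold gain per decade works out to computing power roughly doubling every year.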

"We will never get to a zettaflop computer with CMOS technology like we've used for the past few generations," Scott said. "We will fundamentally need to have a new computing technology."

In a paper published in late 2018, Chinese computer science researchers argued it may be possible to reach zettascale by 2035. The paper outlined numerous challenges, including what could provide the processing -- quantum computing, some type of biological computing or optical computing that may use lasers.

Scott said there are a number of candidate technologies, but "none of them have been proven, so there is no guarantee that we will even get a zettaflop computer."

Despite the unknowns, Scott said he believes zettascale's problems will be solved. To make his point, he rattled off the major technology advances over the last 100 years. These "five different discontinuities" range "from mechanical switches to electromechanical relays to vacuum tubes to discrete transistors to integrated circuits, and we've kept on this exponential growth over time," he said.

"I'm optimistic that we'll find another technology after CMOS that'll get us to zettascale, but we just don't know what it is yet," Scott said.
