The Institute for Health Metrics and Evaluation faced a challenge with its COVID-19 research that is not uncommon among nonprofit organizations that work with massive data sets.
Seattle-based IHME knew it needed to increase its storage capacity -- in its case, via Qumulo appliances -- to handle more petabytes of data. But the institute also had to make sure performance would be adequate, costs would be kept in check, and the impact on its data center footprint at the University of Washington would be minimized.
"We have limited space, so we have to maximize how much we can put in a rack," said Serkan Yalcin, IHME's director of IT, infrastructure and DevOps.
Based at the university's medical school and funded through grants, IHME analyzes worldwide health data and makes forecasts and tools freely available to help governments, hospital systems and policymakers make decisions on resource allocation. IHME conducts research on more than 300 diseases and risk factors, and the COVID-19 pandemic became a natural area of focus in 2020 while its other projects continued.
COVID-19 research adds data to heavy load
Yalcin estimated the COVID-19 research added 10% to 15% more data to the institute's already heavy load. IHME ingests raw data from outside sources and generates even more data as it creates models and visualizations based on best- and worst-case scenarios. One example that has seen considerable worldwide use is IHME's projection of COVID-19 deaths depending on the percentage of the population that wears masks.
When Yalcin joined IHME in 2010, the institute's storage systems had about 5 TB of data. Yalcin said IHME generated at least 500 TB of strictly COVID-19-related data since last summer, and with each month that passes, it generally creates more data than the prior month.
IHME has used scale-out file storage from Qumulo since 2014, when it switched from Quantum's StorNext at a time when storage capacity was nearly doubling every six months. Yalcin said Qumulo offered advantages with monitoring and managing billions of files. He could click a folder and, within seconds, see the source, the number of files, the capacity and usage trends over the last 72 hours or 30 days.
"To that point, I'd never seen any file system that gave you as much insight as Qumulo does," Yalcin said. He left IHME in August 2015 to work as a customer success manager for Qumulo for eight months. Yalcin returned to IHME in 2017, after working as director of customer support and IT at Maana, a computer software firm.
With its data load escalating last year, IHME reached out to Qumulo to check out the latest options for a new platform that could pack in the most data per rack and deliver the best performance it could afford. Qumulo sells both all-NVMe flash systems and hybrid models that combine solid-state drives (SSDs) and cheaper hard disk drives (HDDs).
All-flash won't work economically for IHME
Even though flash SSD prices have been dropping with the latest quad-level cell (QLC) 3D NAND technology, Yalcin said all-flash systems would not make economic sense for IHME over hybrid models because the institute derives no revenue from its work.
Qumulo's hybrid systems sell for about a third of the price of the more energy-efficient all-flash systems, according to Ben Gitenstein, the company's vice president of product management.
So, IHME opted for the Qumulo C-432T hybrid platform with 432 TB of raw storage capacity, using NVMe-based 3.2 TB SSDs to cache data in front of the largest available 18 TB HDDs from Western Digital. The data lands on the NVMe SSDs first, and the hottest data stays on the flash drives to speed access. Colder data goes to the slower HDDs.
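The flash-first placement described above can be illustrated with a toy model. This is a minimal sketch, not Qumulo's actual caching logic, which the article does not describe: the `HybridTier` class, its least-recently-used eviction rule and the `flash_slots` parameter are all hypothetical names and choices made for illustration.

```python
from collections import OrderedDict


class HybridTier:
    """Toy two-tier store: a small flash cache in front of a large disk tier.

    Hypothetical illustration only. Eviction here is plain LRU, which is an
    assumption and not a description of any vendor's implementation.
    """

    def __init__(self, flash_slots):
        self.flash_slots = flash_slots  # assumed cache size, in blocks
        self.flash = OrderedDict()      # most recently used block sits last
        self.disk = {}

    def write(self, key, value):
        # New data lands on flash first; any stale disk copy is dropped.
        self.disk.pop(key, None)
        self.flash[key] = value
        self.flash.move_to_end(key)
        self._evict_cold()

    def read(self, key):
        # Returns (value, tier) so the caller can see where the read was served.
        if key in self.flash:
            self.flash.move_to_end(key)  # a read keeps the block hot
            return self.flash[key], "flash"
        return self.disk[key], "disk"    # no promotion on read in this sketch

    def _evict_cold(self):
        # The least recently used blocks move down to the slower disk tier.
        while len(self.flash) > self.flash_slots:
            cold_key, cold_value = self.flash.popitem(last=False)
            self.disk[cold_key] = cold_value
```

With `flash_slots=2`, writing blocks `a` and `b`, reading `a`, then writing `c` pushes `b` to the disk tier, because `a` was touched more recently. A real system would also weigh access frequency and block size, which this sketch ignores.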
Hybrid system performance
Gitenstein said Qumulo's hybrid systems deliver performance comparable to the all-flash models, with 90% to 95% of the data reads generally coming from the flash cache. But customers need to consider the risk that the hybrid system might not read the data from the flash cache and would instead deliver disk performance, he said.
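A rough way to see why the cache hit rate matters is to compute an average read latency weighted by that rate. The sketch below uses the 90% and 95% hit rates Gitenstein cited, but the latency figures (`FLASH_MS`, `DISK_MS`) are illustrative assumptions, not measurements from Qumulo or IHME.

```python
def blended_latency_ms(hit_rate, flash_ms, disk_ms):
    """Average read latency when hit_rate of reads are served from flash."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * flash_ms + (1.0 - hit_rate) * disk_ms


# Assumed per-read latencies, chosen only to show the shape of the trade-off.
FLASH_MS = 0.1
DISK_MS = 8.0

# 0.0 models the risk case where no reads come from the flash cache.
for hit_rate in (0.95, 0.90, 0.0):
    latency = blended_latency_ms(hit_rate, FLASH_MS, DISK_MS)
    print(f"hit rate {hit_rate:.0%}: {latency:.3f} ms average read")
```

Under these assumed numbers, even a small share of disk reads dominates the average, which is the risk Gitenstein described for workloads that miss the cache.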
Yalcin said the Qumulo appliances meet the institute's performance needs. He said IHME achieves 1 million IOPS with its four older Qumulo racks and now gets the same performance with a single rack of the new C-432T nodes. That translates to cost savings, because IHME rents the racks from the university.
The institute's new 12-node Qumulo cluster, equipped with Western Digital's Ultrastar DC HC550 18 TB HDDs and Ultrastar DC SN640 3.2 TB NVMe SSDs, offers a significant density improvement over its older systems. The institute's 24-node and 14-node Qumulo clusters use 8 TB HDDs and 480 GB SSDs. IHME also has a 15-node cluster with 10 TB HDDs and 480 GB SSDs, and a smaller 8-node cluster with 2 TB NVMe SSDs to cache container images, code and temporary files.
IHME manages 42 racks at the university's data center: about 80% for high-performance computing and 20% for storage. The addition of three new Qumulo racks gave IHME an additional 5 PB of available storage capacity, Yalcin said.
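The capacity figures can be cross-checked with simple arithmetic. This sketch assumes the 432 TB raw figure applies per C-432T node, that it counts only the 18 TB HDDs, and that decimal units are used (1 PB = 1,000 TB); the article does not state any of these explicitly.

```python
NODE_RAW_TB = 432  # raw capacity of one C-432T node (assumed to be per node)
HDD_TB = 18        # capacity of each Western Digital HDD
NODES = 12         # nodes in the new cluster

# Implied number of HDDs per node if the raw figure counts HDD capacity only.
hdds_per_node = NODE_RAW_TB // HDD_TB

# Raw capacity of the whole cluster, before any data-protection overhead.
cluster_raw_tb = NODES * NODE_RAW_TB
cluster_raw_pb = cluster_raw_tb / 1000

print(f"{hdds_per_node} HDDs per node")
print(f"{cluster_raw_tb} TB raw, or about {cluster_raw_pb:.1f} PB")
```

Under those assumptions, the 12-node cluster works out to roughly 5 PB raw, in line with the additional capacity Yalcin cited. Usable capacity would be lower once data protection is taken into account.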
"It was very clear that our need for storage was never going to go away. Now we have close to 9 PB of storage available just from Qumulo, and we utilize 80% of it," Yalcin said. "They were a huge part of our growth and success in that area. It was a partnership to have a unified platform to manage the data and ease the pain."