Building AI foundation models to accelerate digital pathology
Mayo Clinic has deployed NVIDIA's Blackwell-powered DGX SuperPOD to accelerate AI foundation model development for digital pathology, with the goal of improving diagnostic accuracy and treatment.
With the advent of generative AI, health systems are racing to integrate the technology into various aspects of the clinical care continuum. While the most popular genAI use case appears to be clinical documentation, Mayo Clinic recently entered a partnership that would enable it to create new foundation models for digital pathology built on NVIDIA's Blackwell GPU infrastructure.
The partnership, announced at the end of July, involves Mayo Clinic deploying the NVIDIA DGX SuperPOD with NVIDIA DGX B200 systems, an infrastructure that provides AI compute capabilities supported by the Blackwell advanced GPU microarchitecture.
Mayo Clinic plans to use the Blackwell infrastructure to accelerate its foundation model development efforts in digital pathology and beyond, Matthew Callstrom, M.D., Ph.D., medical director of the department of strategy and leader of Mayo Clinic's GenAI program, told Healthtech Analytics. Still, despite the advantages the Blackwell infrastructure offers, foundation model development faces enduring challenges, including ensuring data availability and accuracy.
EXPLORING FOUNDATION MODELS IN DIGITAL PATHOLOGY
Although digital pathology is not a new concept, advances in health IT have significantly enhanced its ability to support clinical diagnosis and disease treatment.
Jim Rogers, CEO of Mayo Clinic Digital Pathology, explained that digital pathology improves upon manual pathology processes in numerous ways. The manual process typically involves the pathologist staining tissue on a glass slide with hematoxylin and eosin (H&E) and then examining it under a microscope to understand what is occurring within the patient's body. Digital pathology replaces this analog approach with high-resolution digital images of the glass slides.
Now, AI foundation models have the potential to further enhance digital pathology by reducing discordance among pathologists.
"That, we think, can make a dramatic difference in how people are diagnosed, how people are treated," Rogers said. "The reason being that if AI can help us by looking for patterns, it could determine that the diagnosis should be this versus that. And right now, we rely on pathologists who are very good, but even with very good pathologists, you have a fairly high discordant rate, meaning that you could put a complex case in front of them and they would disagree. The hope would be that AI could help really bring that discordant rate down and allow us to be in a position to more accurately diagnose."
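To make the "discordant rate" Rogers mentions concrete, here is a minimal sketch of one common way to quantify it: the share of pathologist pairs that disagree on the same case. The article does not say how Mayo Clinic measures discordance, so the function name `pairwise_discordance` and the sample reads below are illustrative assumptions, not the health system's method.

```python
from itertools import combinations


def pairwise_discordance(diagnoses_by_case):
    """Return the fraction of reader pairs that disagree, pooled over all cases.

    diagnoses_by_case: one list per case, holding each pathologist's diagnosis.
    """
    disagreements = 0
    pairs = 0
    for labels in diagnoses_by_case:
        # Compare every pair of readers who looked at the same case.
        for first, second in combinations(labels, 2):
            pairs += 1
            if first != second:
                disagreements += 1
    if pairs == 0:
        raise ValueError("need at least one case with two or more reads")
    return disagreements / pairs


# Hypothetical reads (made up for illustration): three pathologists, three cases.
reads = [
    ["benign", "benign", "benign"],
    ["malignant", "benign", "malignant"],
    ["malignant", "malignant", "malignant"],
]
rate = pairwise_discordance(reads)
print(f"pairwise discordance: {rate:.3f}")
```

In this toy data only the second case produces disagreement, so two of the nine reader pairs are discordant. A chance-corrected statistic such as Cohen's or Fleiss' kappa would be a natural next step when comparing readers with and without AI assistance.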
Not only that, but AI-supported digital pathology could enable more personalized medicine. Using cancer as an example, Callstrom explained that pathology slides contain a lot of information about the tumor cells and the characteristics of one tumor versus another.
"The generative [AI] component here is where it really gets exciting," he said. "So you can start to pair the slide against the treatment and the outcome, and now you can start to use that predictive component to train a model against an unstained slide, which could show you what to expect in terms of treatment response from a patient."
Furthermore, building foundation models for digital pathology can serve as an accelerator for developing other AI solutions without requiring access to large datasets.
Take Mayo Clinic's Atlas Foundation model, for instance. Created through a collaboration between Mayo Clinic, Aignostics and Charité, the model is based on more than 1 million histopathology whole-slide images from Mayo Clinic and Charité, Rogers said. It achieves high performance across 21 public benchmark datasets.
"And now using that model, we can now quickly identify and develop additional algorithms and models as opposed to having to get access to a large dataset every time we want to try to do something," Rogers said.
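One common way a foundation model reduces the need for large datasets is by serving as a frozen feature extractor: each slide (or slide tile) is turned into an embedding vector, and a small classifier is trained on a handful of labeled embeddings. The sketch below assumes that pattern and uses a nearest-centroid classifier. The article does not describe how Atlas is actually reused, and the two-dimensional embeddings and labels here are made up for illustration.

```python
import math


def fit_centroids(embeddings, labels):
    """Average the embedding vectors for each label."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


# Hypothetical embeddings that a pretrained model might produce for four labeled tiles.
train_embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = ["tumor", "tumor", "normal", "normal"]

centroids = fit_centroids(train_embeddings, train_labels)
print(predict(centroids, [0.7, 0.3]))
```

Because the expensive representation learning has already happened in the foundation model, the downstream step can get by with very few labeled examples, which is the advantage Rogers describes.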
HOW THE NVIDIA INFRASTRUCTURE WILL SUPPORT MODEL DEVELOPMENT
Foundation models for digital pathology have the potential to transform the field; however, developing them requires robust computing infrastructure.
This is where NVIDIA's Blackwell-powered DGX SuperPOD platform comes in. The full-stack data center platform provides computing, storage, networking, software and infrastructure management.
According to Callstrom, this infrastructure will allow the health system to build foundation models more efficiently.
"You can do it faster, more energy efficient [and] it can take on larger data sets as you do that work," he said. "So it gives you an expanded capability and also allows you to move faster with that capability."
Rogers echoed Callstrom, adding that the platform's computing power is immense, enabling model development efforts to skyrocket.
"It's like having rocket fuel for your development efforts," he said. "It allows us to go very, very quickly -- what could take months before, could now take weeks or days, depending on what we're trying to build."
Additionally, partnering with NVIDIA to deploy the platform on-site offers a greater level of flexibility. Rogers noted that having access to the platform allows the health system to experiment with creating different models, even if not all of them pan out. The health system also plans to use the NVIDIA platform to further train and improve the Atlas model.
CHALLENGES TO FOUNDATION MODEL DEVELOPMENT
Although Callstrom and Rogers expect the NVIDIA platform to significantly enhance foundation model development for digital pathology, there are still overarching challenges that the health system must contend with.
One significant challenge is the availability of data. Callstrom explained that to build a foundation model, the data first needs to be available in specific, standardized formats.
"So that means, if it's digital pathology, you need the digital slides, and we now have 20 million slides that we've digitized so that we can use that content to build models," he said. "We've also done the work to protect patient identity, so we've de-identified the data."
However, de-identifying and standardizing data becomes more complex when various organizations come together. For instance, the development of the Atlas model involved data from Mayo Clinic as well as Charité, a healthcare organization in Germany. While this diversity of data helps improve model performance, the organizations had to work together to gather and standardize the data.
Data accuracy is the other, arguably harder, aspect of data management when building foundation models. The promise of foundation models is immense, but for the models to achieve that promise, accuracy is paramount.
"You have to make sure you use things like retrieval-augmented generation, and link back to source content, and also make sure you don't have errors of omission," Callstrom said. "And you have to make sure that you can validate the performance of these models in a healthcare setting."
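The idea of retrieval-augmented generation with links back to source content can be sketched as follows: retrieve the passages most relevant to a question, keep each passage's source identifier, and hand both to the model so its answer can cite where each claim came from. This is a minimal illustration under stated assumptions, not Mayo Clinic's pipeline. The keyword-overlap scoring stands in for a real search index, the generation step is omitted, and the documents, `source_id` values and function names are all hypothetical.

```python
def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real text processing."""
    return set(text.lower().split())


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents that share the most words with the query."""
    query_tokens = tokenize(query)
    scored = []
    for doc in documents:
        overlap = len(query_tokens & tokenize(doc["text"]))
        if overlap > 0:
            scored.append((overlap, doc))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query, passages):
    """Prefix each passage with its source ID so the answer can link back to it."""
    context = "\n".join(f"[{p['source_id']}] {p['text']}" for p in passages)
    return (
        "Answer using only the sources below and cite their IDs.\n"
        f"{context}\nQuestion: {query}"
    )


# Hypothetical, made-up records for illustration only.
documents = [
    {"source_id": "path-report-001", "text": "biopsy shows invasive ductal carcinoma grade 2"},
    {"source_id": "radiology-note-014", "text": "chest imaging shows no acute findings"},
    {"source_id": "path-report-002", "text": "margins negative after excision"},
]

query = "what grade is the carcinoma"
passages = retrieve(query, documents)
prompt = build_prompt(query, passages)
print(prompt)
```

Carrying the source ID through to the prompt is what makes the output auditable: a reviewer can trace each statement back to the record it came from, and an empty retrieval result is a visible signal rather than a silent omission.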
With the Blackwell infrastructure, Mayo Clinic hopes to accelerate foundation model development beyond digital pathology as well. Imaging, genomics and disease risk prediction are additional areas of interest.
Ultimately, Callstrom believes that the health system's focus on developing foundation models and doing so quickly and efficiently will benefit the most important stakeholders in healthcare -- the providers and patients.
"We're incredibly optimistic that these new tools will reduce the administrative burden for the work that our clinicians are doing today," Callstrom said. "The manual work that's done in the background to be able to understand a patient and treat them well is considerable. We think we can reduce that burden with these tools and give bandwidth back to our clinicians to be able to think, not just do the manual work. And when you think about unlocking data [with AI foundation models], we could learn more about the best treatment options for patients and predicted outcomes for patients. That would be a huge impact and benefit to patients."
Anuja Vaidya has covered the healthcare industry since 2012. She currently covers the virtual healthcare landscape, including telehealth, remote patient monitoring and digital therapeutics.