The healthcare industry has seen a surge in data in recent years, especially due to the increased adoption of electronic health records after Congress passed the HITECH Act in 2009.
Medical images make up much of the healthcare data currently available, and the volume of data is projected to grow to 2,413 exabytes -- there are 1 billion gigabytes in 1 exabyte -- by 2020. Healthcare organizations are thus faced with the tasks of not only storing that data, but also analyzing it for patient care so it doesn't just sit dormant.
Medical imaging data contains a wealth of information that can be used to enable modern healthcare approaches like precision medicine and population health. But because medical imaging data sets are large -- in some cases 10 GB or more -- healthcare organizations must store them in a way that lets providers access the most recent data first, and quickly.
Setting storage priorities
One issue organizations often run into is that medical imaging data isn't always stored chronologically. It can also be difficult to determine which storage technology to use. In a clinical setting where patient information helps physicians make treatment decisions, the most recent data is going to be the most valuable, said J. Gary Seay, a principal adviser at BrightWork Advisory, a healthcare IT consultancy in Eagleville, Tenn. Seay is also the former CIO of Community Health Systems, one of the leading operators of general acute care hospitals in the U.S., based in Franklin, Tenn.
"Near-care data probably needs to be handled in media that really focuses on high-speed access," he said. The volume of data also needs to be taken into consideration. "If I just dumped everything I know about somebody into flash [storage] and try to access it in real time, it's still going to be slow because I had to wade through everything to get to the stuff that's relevant."
Unlike disk-based storage technology, such as a hard drive, flash storage has no moving parts. Flash is also nonvolatile, so it won't lose data in a power outage; it uses memory cells on a chip to store data for extended periods of time.
Retention policies and lifecycle management should also be taken into consideration when deciding where and how to store medical imaging data, said Mike Jackman, president and CEO of Mach7 Technologies, an enterprise imaging vendor. "There are laws that tell you how long you have to keep [data] for, and so it depends on what the law is and what doctors need in terms of looking at prior studies," Jackman explained. While there's no federal law regarding medical image retention, state laws vary.
According to Seay, medical images are one of the largest and longest held sources of patient data. With the advent of DICOM (Digital Imaging and Communications in Medicine) storage standards, imaging studies have been digital for decades, and regulations require that images be retained for years, if not decades.
"These studies constitute the biggest of big patient data," Seay said. "Common historical practice has been to manage imaging study workflow productivity, image retention requirements, storage costs, image access performance, data security and recovery needs by leveraging tiered data storage and access. Studies needed to support a patient in treatment were held in active, rapid access media; others are archived in lower-cost, less immediately accessible solutions."
It's helpful to apply rules to the storage medium that manages the lifecycle of the information so that the most recent data is readily available, Jackman added. Similarly, there should be rules as to how large a file can be before it's moved to a different storage medium. "We can say, for cardiology, the max size that we should be storing is 10 GB studies, and if it gets above a certain level or below a certain level, let us know," Jackman explained. "This is very important when you're managing a [storage] infrastructure."
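The kind of rules Jackman describes can be sketched in a few lines of Python. This is only an illustration, not Mach7's implementation: the `Study` record, the one-year active window, the lower size bound and the function names are all assumptions made for the example. Only the 10 GB cardiology ceiling comes from Jackman's remark.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Illustrative thresholds. Only the 10 GB ceiling is taken from the article;
# the active window and the lower bound are assumptions for this sketch.
ACTIVE_WINDOW = timedelta(days=365)  # hypothetical: last year stays on fast media
MAX_STUDY_GB = 10.0                  # cardiology ceiling quoted by Jackman
MIN_STUDY_GB = 0.01                  # hypothetical floor for a suspiciously small study


@dataclass
class Study:
    """Hypothetical minimal description of one imaging study."""
    study_id: str
    department: str
    study_date: date
    size_gb: float


def choose_tier(study: Study, today: date) -> str:
    """Keep recent studies on the fast tier and archive everything older."""
    if today - study.study_date <= ACTIVE_WINDOW:
        return "active"
    return "archive"


def size_alert(study: Study) -> Optional[str]:
    """Return a warning when a study falls outside the expected size range."""
    if study.size_gb > MAX_STUDY_GB:
        return "above maximum"
    if study.size_gb < MIN_STUDY_GB:
        return "below minimum"
    return None
```

In practice the window and size limits would be set per department and per state retention rules rather than hard-coded, but the shape of the check stays the same: one rule decides where a study lives, another flags studies that fall outside the expected range.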
Anticipating when and what data will need to be accessed can provide another challenge for healthcare organizations. Data scientists in particular face the unenviable task of pulling data and analyzing it to allow providers to make real-time care decisions, said Dr. David Delaney, global vice president and chief medical officer of SAP Health.
"These are all things where the existing disk-based technologies have not been able to keep up with what needs to be done," Delaney said. "And that's simply because of the latency, the delay when you go to pull a piece of information off a spinning magnetic hard drive and it's between 2,000 and 10,000 times -- depending on the kind of operation you're performing -- slower than pulling it from memory."
Delaney noted that data scientists have been trying to anticipate what healthcare providers are going to ask for and store it logically on a disk so that it can be accessed quickly. "When they do perform, it's pretty good and it's livable," he said. "But when you ask something they haven't anticipated, it's glacial -- think overnight or longer, a day or two -- to get answers back."
The two-tiered medical image storage approach
Ultimately, before medical imaging data can be used for precision medicine, population health or other modern healthcare approaches, it must be stored in a way that makes it easily and rapidly accessible.
"I suspect that a practical strategy is going to be a mix of the technologies that are available," Seay said. He continued:
The consumption of this kind of data, the application of it, is likely to be at two levels. One is sort of broadly [analytical]. Those kinds of analyses can run sort of in what I call background mode, and they can take some time, and they're churning through everything. The storage media then has to be safe and secure and it has to be reasonably high-speed in terms of accessibility, but it doesn't have to be real-time kinds of responses.
The other level is more around enabling clinical decision-making at the point of care, and that has to be fast. There's probably two layers, maybe more, that have to blend into this, and flash and some of the high-speed, high-volume kinds of technologies might be where data that is determined to be clinically operationally relevant to care right now is stored.
Medical imaging data and population health
Stored medical imaging data can play a vital role in population health management, but only if it's organized properly and identities are defined, Seay said. Population health most commonly refers to the health outcomes of a group of individuals, and medical imaging data can be analyzed to detect trends in groups of patients to determine appropriate interventions.
But data has to be organized around an individual member of a population before it can be applied to the entire population, Seay acknowledged. That's because the patient profile has to be defined before the larger population can be identified.
"You start to study how many members are in a particular segment, maybe where they live and how they're living, and that's where you start to begin to see a picture emerge," he said. Once the population is defined and identified, then healthcare providers can begin focusing their interventions to improve patient outcomes.
Precision medicine and medical imaging data
Medical imaging data can also be used for precision medicine by looking at what treatments have worked for similar patients. Unlike population health, precision medicine involves making a treatment plan for an individual -- not a group of individuals -- based on genetic, environmental and lifestyle factors. This approach is more effective when there are multiple patient data sources.
"We're making these decisions based on, really, anecdotal evidence," Delaney said. "And a plural of anecdotes doesn't equal good data. And if that's all you had, that would be okay." However, a healthcare organization sees many different patients, some of whom may share a profile similar to the patient in question. If the defined patient population is extended to others in the city, state, country or even the world with a similar profile whose outcomes have already been documented, the provider can then make an evidence-based decision for the most effective treatment plan.
Delaney provided an example of how medical imaging data can be used to determine the best treatment for a fictional patient, Mrs. Smith, whose MRI showed a tumor. Her healthcare provider uses natural language processing (NLP) to pull out what tumor marker Mrs. Smith has and match it against age, gender, ethnicity and other factors to compare with patients in the healthcare organization and in a cancer patient registry. The system can read and analyze data from those different sources and classify that data based on the content of the report.
In this scenario, the algorithms of an NLP system could be programmed to extract data for cancer patients who fit similar demographics as Mrs. Smith and had a positive outcome to determine the best treatment option. "We're going to pull back this micro-cohort of patients who are very similar to the patient in question," Delaney explained. "But the key difference is they've all been treated, and you know their outcome. And so now we can begin to do precision medicine and personalized medicine where we're leveraging the patient population as similar as possible to meet the evidence-based decision about the best treatment for her."
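The micro-cohort idea Delaney describes can be sketched roughly as follows. This is not SAP's system: it assumes the NLP step has already turned reports into structured records, and the `PatientRecord` fields, the five-year age window, the placeholder marker name and the function names are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PatientRecord:
    """Hypothetical structured record, assumed to be produced by an NLP step."""
    patient_id: str
    age: int
    sex: str
    tumor_marker: str
    treatment: Optional[str] = None          # None means not yet treated
    positive_outcome: Optional[bool] = None  # None means outcome unknown


def micro_cohort(patient: PatientRecord,
                 records: List[PatientRecord],
                 age_window: int = 5) -> List[PatientRecord]:
    """Select treated patients with a known outcome who resemble the patient."""
    return [
        r for r in records
        if r.treatment is not None
        and r.positive_outcome is not None
        and r.tumor_marker == patient.tumor_marker
        and r.sex == patient.sex
        and abs(r.age - patient.age) <= age_window
    ]


def success_rate_by_treatment(cohort: List[PatientRecord]) -> Dict[str, float]:
    """Share of positive outcomes for each treatment seen in the cohort."""
    counts = defaultdict(lambda: [0, 0])  # treatment -> [positives, total]
    for r in cohort:
        counts[r.treatment][0] += int(r.positive_outcome)
        counts[r.treatment][1] += 1
    return {t: pos / total for t, (pos, total) in counts.items()}
```

A provider-facing tool would match on far more factors, such as ethnicity and registry data, and would need enough patients per treatment for the rates to mean anything. The sketch only shows the two steps Delaney names: narrow the population to similar, already-treated patients, then compare their outcomes.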