Data access key to Regeneron's innovation efforts

After developing a COVID-19 treatment in mere months, Regeneron adopted a data catalog and is developing a data governance framework to speed up its drug development pipeline.

Quick access to data enabled Regeneron to develop an antibody cocktail that healthcare providers have used to help treat COVID-19 patients throughout the pandemic.

The Food and Drug Administration granted the antibody cocktail emergency use authorization in November 2020 for the treatment of mild to moderate cases of COVID-19 in patients ages 12 and older. That rapid success has since spurred Regeneron to adopt a data catalog to make data access easier and reduce the time needed to develop treatments for other health conditions.

Regeneron, founded in 1988 and based in Tarrytown, N.Y., is a biotechnology company that develops drug treatments for a variety of medical purposes. Originally focused on the regenerative capabilities of neurotrophic factors, the company branched out and has developed drugs to treat colorectal cancer, rheumatoid arthritis and high cholesterol, among other illnesses and conditions.

Traditionally, it took Regeneron 15 years to bring a new drug to market, according to Patrick Saucier, director of data governance and information architecture at Regeneron. But with better access to and organization of its own data -- including a data catalog and the beginnings of a data governance framework -- the company is now aiming to make its research more efficient and reduce the time it takes to develop new medical treatments.

"Access to information allows us to make decisions faster," Saucier said during a webinar hosted by data management vendor Alation. "By finding data quicker, we are attempting to accelerate time to market. How much faster? We don't know yet. But discovery to approval is our pipeline, and we're trying to ensure the data we have is there for people to make decisions."

Regeneron's Patrick Saucier (left) and Alation's Matt Turner discuss Regeneron's implementation of a data catalog in an effort to increase the speed at which it can bring drug treatments to market.

Data silos

Before the development of its antibody cocktail in the early months of the pandemic, Regeneron's research data was siloed. The company's research scientists each develop their own hypotheses and run their own tests. Every one of those tests generates its own data, as does every component of every researcher's development pipeline.

Regeneron had a data lake where data was supposed to be stored so it would be accessible for subsequent research of both similar conditions and alternative treatments for other conditions, but data didn't always make it into the lake.

According to Saucier, researchers often view research as their own and look at the data they generate as proprietary, so they tend to hoard it.

In addition, Regeneron didn't have a data governance framework that defined terms and helped standardize the data developed across the organization. Therefore, even when data did make it into the data lake, similar data points that could have been used together to make scientific discoveries remained separate, and the company perhaps missed opportunities.

"Different terminology can impede discovery," Saucier said.

Pandemic response

When COVID-19 began to spread, however, Regeneron's researchers worked in concert with one another. The organization had worked on treatments for similar conditions before, so it had data that could potentially lead to a treatment for the coronavirus.

"There was a callout for data," Saucier said. "It was a mass callout asking, 'Who has worked on [something similar to COVID-19], what information do we have, where can we find it?' and then consolidating it together so it can be used to target what we're facing with the pandemic."

Regeneron's scientists put their data in a notebook, where it was then aligned and normalized to make it easily discoverable and actionable.

And within months, Regeneron developed an antibody cocktail that proved effective against early strains of COVID-19. The treatment was used until recently, when data showed that early COVID-19 treatments are not effective against the omicron variant that currently makes up the vast majority of new cases.

"It normally takes 12 to 15 years for a treatment to go from discovery to market, but with COVID we were talking about [a much faster turnaround]," Saucier said. "That was a key element. It brought everyone's minds together. We needed to share the data front and center."

Philosophical shift

Although Regeneron made access to crucial data quick and easy in order to respond to the pandemic and develop a treatment for COVID-19, most of its data was still splintered across different sources as of late 2020.

But the company's experience developing a treatment for COVID-19 -- normalizing data that had previously been defined by individual researchers, and making it quick and easy to find and act on -- showed Regeneron what it could accomplish if its scientists had better access to data.

Regeneron shifted its thinking, according to Saucier. No longer was it acceptable to take 15 years to research, develop, test, win approval for and bring a new drug treatment to market.

In January 2021, just two months after its COVID-19 antibody cocktail received authorization for emergency use, Regeneron adopted a data catalog from Alation and named it Regeneron Information and Data Explorer (RIDE). Its aim was to make data findable, accessible, interoperable and reusable.

A year after its launch, RIDE now has 153 users who have visited the catalog more than 30,000 times, and it includes more than 11,000 production tables, 317 articles and 302 published queries that can be accessed and used to enable data-driven decision-making.

"Looking at our overall data governance picture, data cataloging was something we looked at to know where data is, who has it and who is responsible for it, how it can be used, and what the quality of the information is," Saucier said. "That assists us in finding and reusing information. It's helped us break down the silos in our organization and get on a path toward enterprise data governance."

Just the beginning

Regeneron's implementation of a data catalog is not an end, according to Saucier. Instead, while it made data access simpler, it marked the start of the company's digital transformation.

It enabled Regeneron to begin developing a data governance framework, but that framework is not finished, nor is the culture shift needed to make the company truly data-driven.

"Our gateway into data governance was the catalog," Saucier said. "Through cataloging, we're developing the foundational elements of governance. We understand where we want to be, and it's out there as our destination. We're evolving the mentality, and using the catalog has gotten people [on board with] our governance journey."
