AWS Glue became a key tool for the Walt Disney Company during the COVID-19 pandemic.
AWS Glue is Amazon Web Services' serverless data integration platform. The tool enables users to build new data pipelines, connect data from different sources, define data, implement data governance frameworks, develop and schedule automated data workflows, and monitor and troubleshoot data management when necessary.
Recently, during re:Invent 2022, a user conference hosted by AWS, the cloud computing giant unveiled a new feature that automatically monitors and manages the freshness, accuracy and integrity of Amazon S3 data lakes and Glue data pipelines.
The Walt Disney Company, meanwhile, is a mass media and entertainment enterprise founded in 1923 and based in Burbank, Calif.
As part of its entertainment business, Disney owns and operates six resort destinations, 12 theme parks, 53 hotels and five cruise ships. When the pandemic struck in March 2020, Disney was forced to close all those destinations.
Eventually, COVID-19 restrictions began to ease, and Disney was able to reopen its properties. Figuring out how to do so safely amid a pandemic, however, proved complicated.
It was going to take lots of data analysis -- including developing an entire slew of new reports, models and dashboards -- to figure out how to maximize attendance while minimizing risk to guests and staff and then implement the plan, according to Ralph Peterkin, principal technical architect at Disney.
"We needed to understand the numbers," he said during a streamed session at re:Invent. "We needed to understand what capacity could be and if we were hitting our goals. We needed new analytical processes in place. We needed to spin up several new sets of jobs."
But Disney's existing data integration platform was not suited for the company's new needs, Peterkin continued.
"There were some challenges," he said. "It wasn't simple."
A new platform for Disney
Disney was an AWS customer before the pandemic. But for its data integration needs, Disney cobbled together open source tools including Apache Spark to build Hadoop clusters. It then ran those clusters on Amazon EC2.
"We used a typical stack," Peterkin said.
But there were problems with the combination of Disney's typical Hadoop stack and EC2 even before the pandemic, according to Peterkin. Clusters were difficult to secure, they were at or near capacity, and it took significant planning and coordination to scale them up to meet Disney's growing data needs.
All that meant diverting resources -- application developers and data engineers -- away from building new data products to doing maintenance work.
With even more demands placed on the company's data due to the pandemic, Peterkin said he and his team questioned whether to continue using Disney's existing data integration tools or pivot to a new platform.
They wondered whether new challenges would require too much work with Disney's old Hadoop stack. And they wondered whether there were other options that would enable Disney's data engineers to use the same skills they had already developed without forcing them to learn new coding languages.
So, they developed a set of criteria for changing data integration platforms.
If Disney were to change, it needed a serverless computing platform that could be up and running quickly without forcing data engineers to build significant capabilities on top of the platform itself. It needed a Spark-based platform that wouldn't require retraining data engineers, and it needed a platform that was cost-efficient.
"We wanted something that would allow our developers to get in, write code and go," Peterkin said.
Disney found what it was looking for in AWS Glue, which the cloud computing giant first released in 2017. Competing data integration platforms include Microsoft Azure Data Factory, Talend Data Integration and SAP HANA Cloud.
Disney and Glue
With AWS Glue, Disney built data pipelines that its data engineers, software engineers and platform engineers could use to develop the data products that inform decisions.
Disney, however, needed to run tens of thousands of jobs and quickly discovered that its data volume was too much for a typical instance of AWS Glue. But with Glue, Disney was able to use APIs to create clusters of Glue instances, and those clusters have the compute power to meet Disney's data needs.
Now, Disney's data pipelines feed data into Glue clusters, where developers and engineers can create and run jobs.
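Peterkin did not say which APIs Disney used. As a rough illustration only, Glue job runs can be started programmatically, with compute sized per run, through the `start_job_run` call in boto3, the AWS SDK for Python. In the sketch below, the job name, the `--run_date` argument and the worker settings are hypothetical and are not taken from Disney's setup.

```python
from typing import Any, Dict


def build_job_run_request(job_name: str, run_date: str, workers: int = 10) -> Dict[str, Any]:
    """Build keyword arguments for Glue's StartJobRun API.

    Every value here is an illustrative assumption, not a Disney setting.
    """
    if workers < 1:
        raise ValueError("workers must be a positive integer")
    return {
        "JobName": job_name,                     # hypothetical Glue job name
        "Arguments": {"--run_date": run_date},   # hypothetical job parameter
        "WorkerType": "G.1X",                    # one of Glue's worker sizes
        "NumberOfWorkers": workers,              # compute allocated to this run
    }


def start_job_run(request: Dict[str, Any]) -> str:
    """Submit the run to Glue and return its run ID (requires AWS credentials)."""
    import boto3  # imported here so the request builder works without AWS access

    glue = boto3.client("glue")
    response = glue.start_job_run(**request)
    return response["JobRunId"]
```

Keeping the request builder separate from the submission call means the sizing logic can be checked without touching an AWS account.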
In addition, Disney was able to standardize its data integration workflow to avoid coding differences.
"We actually built an internal Glue framework," Peterkin said. "There is a [traditional] Glue framework, but our requirements are slightly different [from other organizations], and we wanted to build a standard way for developers to handle things that were Disney-specific. If you give 10 developers some piece of code to write, they're going to do it 10 different ways, and we did not want that to happen."
In particular, it was critical for Disney to build in a standard, governed way to handle compliance and sensitive data that requires different permissions for different employees, he noted.
"We provided our developers with an end-to-end tool," Peterkin said.
The tool, built with AWS Glue, uses a PySpark codebase and YAML configuration files to run and track jobs without requiring developers to write new Spark code. Both are familiar to Disney developers and engineers from their previous work with Hadoop clusters.
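Disney's internal framework is not public, so the following is only a minimal sketch of what a configuration-driven PySpark job could look like. The config keys, bucket paths and function names are all hypothetical. The YAML file is shown as a comment, and its parsed form is written out as a plain dictionary so the example runs without extra packages.

```python
from typing import Any, Dict, List

# Hypothetical job definition. On disk it might be a YAML file such as:
#
#   name: daily_attendance
#   source: {format: parquet, path: "s3://example-bucket/raw/attendance/"}
#   columns: [park_id, visit_date, guest_count]
#   target: {format: parquet, path: "s3://example-bucket/curated/attendance/"}
#
# loaded with yaml.safe_load (PyYAML). Its parsed form is a plain dict:
JOB_CONFIG: Dict[str, Any] = {
    "name": "daily_attendance",
    "source": {"format": "parquet", "path": "s3://example-bucket/raw/attendance/"},
    "columns": ["park_id", "visit_date", "guest_count"],
    "target": {"format": "parquet", "path": "s3://example-bucket/curated/attendance/"},
}

REQUIRED_KEYS = ("name", "source", "columns", "target")


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems with the config; an empty list means it is usable."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    for section in ("source", "target"):
        if section in config:
            for field in ("format", "path"):
                if field not in config[section]:
                    problems.append(f"{section} is missing {field}")
    return problems


def run_job(spark: Any, config: Dict[str, Any]) -> None:
    """Read, select and write using standard PySpark DataFrame calls.

    `spark` is expected to be a SparkSession supplied by the runtime.
    """
    problems = validate_config(config)
    if problems:
        raise ValueError("; ".join(problems))
    source, target = config["source"], config["target"]
    df = spark.read.format(source["format"]).load(source["path"])
    df = df.select(*config["columns"])
    df.write.mode("overwrite").format(target["format"]).save(target["path"])
```

In a design like this, developers edit only the configuration, while the shared `run_job` code path stays the same for every job, which is one way to get the standardization Peterkin described.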
"They just have to specify a couple of things with YAML files, and [the tool] will take care of the rest," Peterkin said.
Disney's transition away from Hadoop clusters to AWS Glue began because of the COVID-19 pandemic. It continued, however, because of its success integrating data with Glue.
Now, Disney has gotten rid of Hadoop altogether and uses AWS Glue for all its data integration projects, according to Peterkin. As a result, developers and engineers are no longer held back by the technical debt of maintaining Hadoop clusters and can instead spend the majority of their time building actual data products.
"We started with a set of COVID workloads that we wanted to spin up and get going because of a specific reason," he said. "But it worked so well that we decided to migrate [preexisting] workloads off of our Hadoop cluster to see how that worked. We got to the point where we said, 'Why not just get rid of Hadoop altogether?' and we moved all our workloads to Glue."
Disney is now running tens of thousands of jobs on AWS Glue, according to Peterkin. And that could soon break the 100,000-job barrier.
"AWS Glue has given us the ability to scale our workloads beyond anything we could imagine," Peterkin said.