The business case for integrating Hadoop with SAP HANA
Adobe tracks everything its users do and dumps the data into Hadoop. How did the company make the business case for integration with SAP HANA?
Adobe uses SAP Data Services to load Hadoop data into SAP HANA. What was your business case for using Hadoop for data management, and why is it important to have a specific business case for any data project, and in particular for one that involves Hadoop?
We used Hadoop with SAP Data Services because of the volume of data. The Hadoop component is tracking all of the events that happen in the Adobe Creative Cloud. We have 2.5 million paid users and 12 to 15 million free users. Everything those users do in the Adobe Creative Cloud is streamed into the Hadoop system. And so it really was just a big data problem.
The original decision around implementing Hadoop was something I wasn't a part of. But looking at it, it makes sense based on the volume of data that's going in there and the number of events that are being tracked.
It's important to make the business case for Hadoop because Hadoop is difficult, and integrating Hadoop with an analytical engine like HANA is difficult. Many more people are familiar with doing that kind of integration with a traditional relational database. So if you can, it is likely easier to use a traditional relational database than to figure out how to get the integration to work with Hadoop. Also, far more people have Oracle or SQL Server experience, so if you needed to hire some additional resources to help, it would be easier to find those people than to find someone who has extensive Hadoop experience and who knows how to integrate that data into HANA.