Sanjay Srivastava has to make a fundamental choice when designing and building data management environments: Go with open source data management technology, or buy commercial options?
For Srivastava, who works with clients on such choices, it comes down to the role data plays for the company.
"If you're born in the cloud and data is a core value and a driver of your business, then go with open source," said Srivastava, chief digital officer at business transformation services firm Genpact. "But not if you're setting something up in your own environment, and you want to run it long term, and you're using data to augment and support your core business."
Many enterprise data officers and IT leaders find themselves in a similar situation: forced to choose between proprietary software and open source for their data management needs.
It's a choice that will likely come up more frequently as the space expands.
Just consider the market's current size and expected growth. Grand View Research valued the global enterprise data management market at $72.8 billion in 2020, with an expected compound annual growth rate of 13.8% through 2028.
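To put that growth rate in perspective, here is a rough back-of-the-envelope sketch of what a 13.8% compound annual growth rate implies. It assumes growth compounds annually from the 2020 base over the eight years to 2028. That is an assumption for illustration, not a figure from Grand View Research, whose forecast window may be defined differently. The function name `project_value` is hypothetical. Under these assumptions, the implied 2028 market size lands on the order of $200 billion.

```python
def project_value(base: float, cagr: float, years: int) -> float:
    """Compound a base value forward at a constant annual growth rate."""
    return base * (1 + cagr) ** years


# Figures quoted in the article: $72.8 billion in 2020, 13.8% CAGR.
# Assumption: growth compounds once a year for the 8 years from 2020 to 2028.
BASE_2020_BILLIONS = 72.8
CAGR = 0.138
YEARS = 2028 - 2020

projected_2028 = project_value(BASE_2020_BILLIONS, CAGR, YEARS)
print(f"Implied 2028 market size: ${projected_2028:.1f} billion")
```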
The large dollar value reflects the significant range of capabilities needed for an enterprise data management program, as well as the numerous vendors and open source options on the market.
That plethora of options, though, is at the heart of the challenge. And, while Srivastava has his own rule of thumb for deciding between proprietary and open source when building an organization's data management program, he and other experts said each enterprise must understand the benefits and challenges of both options and recognize that the choice between the two isn't always clear-cut.
Rather, there is a series of tradeoffs to consider.
"Open source offers cost-efficient alternatives to high-cost, commercially off-the-shelf products," said Sandhya Balakrishnan, U.S. region head for intelligent enterprise solutions at Brillio, a digital transformation consulting firm. "However, most common concerns around open source data management tools include security, lack of support and, many times, hidden costs around installation and ongoing maintenance."
On the other hand, "many commercial data management tools successfully shield the complexity of data management for the users," Balakrishnan added.
Proprietary data management software pros and cons
Proprietary software is software whose source code is unavailable to users; it's sold by commercial entities as off-the-shelf solutions that may or may not be customizable to any degree.
Although its definition may make it seem overly rigid, IT advisors stressed that proprietary data management systems offer several significant benefits to organizations.
"The advantage is, even though it's something you paid more for [than open source], it will work in the enterprise for production," said William McKnight, president of McKnight Consulting Group.
Many vendors deliver solutions with a full range of complementary capabilities. They build in integrations so that enterprise teams can more quickly and easily build out their data management environments. And they're adding more automation.
"They're reliable. They can deliver high performance at scale, security, innovation and automation," added Noel Yuhanna, vice president and principal analyst at Forrester Research.
Organizations also get vendor support when opting for proprietary data management technologies, and they typically find that it's easier to hire the talent necessary to implement and maintain commercial data management products -- particularly the most commonly used ones -- vs. open source options.
Those are important considerations for organizations looking to quickly advance their use of data, experts said.
"Usability from a development and maintenance standpoint, as well as assurance of ongoing support and enhancements, offers large enterprises the ability to scale by focusing on the right aspects of enterprise architecture," Balakrishnan said.
However, proprietary data management software can come with some potential downsides, according to experts:
Enterprise teams can't innovate on proprietary code and instead must rely on the vendors to keep pace with the innovations necessary to succeed in a rapidly evolving digital landscape.
Proprietary software costs more -- particularly in upfront fees -- than open source options.
And there's the chance of getting locked in with one vendor, with the cost and challenges of switching to another vendor overwhelming the benefits of making the change.
Open source data management software pros and cons
In contrast to proprietary data management software, the open source options are released under a license that lets users deploy the code to develop their own systems and also update, change and modify it for their own needs.
That flexibility enables the creation of data management solutions that fit each organization's own unique needs, McKnight said.
"With open source, if you're so inclined, you can create forks of your own in the code. For some, that might matter," he said.
Open source is also less expensive to use.
"It's obviously advantageous to the budget, so if money is a constraining factor, then you can go with open source," McKnight added.
Moreover, enterprises generally can test open source options more easily, enabling them to run a proof of concept or pilot before deploying open source more broadly or even graduating to the paid/higher-cost enterprise versions.
Additionally, open source gives enterprise teams the ability to innovate on the code and to draw on improvements that other users bring to it.
"With the open source community, you have more people contributing to lines of code, so you're going to get more innovation," Srivastava said.
Those benefits have many enterprise IT and data leaders looking at open source, Yuhanna added.
"The [COVID-19] pandemic seemed to up the look at open source," he said. "What we've seen is that open source tools definitely help you lower your costs. That's one of the driving forces behind adopting it, but open source can also help you avoid vendor lock-in and future-proof your architecture."
As is the case with proprietary software, open source comes with some possible drawbacks. Open source is generally harder to integrate than proprietary alternatives, Yuhanna said, adding that it typically takes more work to get it to interoperate well.
"That's work that needs to be done on-site," he said.
Furthermore, experts said organizations need technologists with the specialized skills required to build, deploy, maintain and improve the open source code. Those technologists must keep up with any changes to the code, and they may be required by the license to contribute back to the open source community. They also have to be capable of doing all that work without the 24/7 customer support that typically comes with commercial software products.
The future is a mix
Enterprise IT and data leaders may not need to choose between proprietary and open source -- it doesn't have to be an all-or-nothing approach.
Rather, experts said, they can use proprietary software for some needs and use open source for others. That, in fact, may be the optimal approach for many organizations.
"In our experience, best of breed is the norm with increasing acceptance of open source," Balakrishnan said, noting that organizations might go with commercial providers for data movement and storage but use open source options, such as Apache Kafka or Apache Spark, for near-real-time data processing and Apache NiFi or Apache Airflow for orchestration or workflow management.
Others said organizations might want to use open source for pilots and proofs of concept and then move to commercial choices when scaling.
Enterprise IT and data leaders are also increasingly turning to solutions that essentially combine the two, said Brad Ptasienski, partner at digital consulting firm West Monroe and its data engineering and analytics market lead.
Noting that West Monroe believes open source "is going to be the new standard and mainstream for large-scale data processing and storage," Ptasienski said he sees a lot of positives in using commercial "wrapper" solutions that have open source at their core but with some of the advantages of proprietary software wrapped around it.
"It's almost a hybrid approach," he added. "It works more like a platform, but it's open source at its core."