DataOps adoption continues to rise in organizations that understand the necessity of accessing and governing the data used to feed analytics engines and machine learning.
Like DevOps, DataOps is about getting something into production. With DevOps, it's code; with DataOps, it's data.
This year, organizations asked fewer questions about what DataOps is because its general definition is broadly understood. Less obvious may be the business value it delivers, especially if DataOps is approached as just another technology to implement.
"There are a lot of tools, systems and technologies available, but the challenge is to find the one which matches your organizational needs," said Sush Apshankar, solution lead of analytics and AI/ML at global technology research and advisory firm ISG. "The second piece is people. You need to build that data-first mindset and then get the right people on the team."
Importantly, DataOps doesn't just help ensure that the necessary data is available. It also ensures the timely delivery of that data and that the data is properly governed.
"The DataOps teams should be focused on simplifying and optimizing current data pipelines and inserting their people, architecture, best practices, designs and patterns into all new data development efforts," said Dan Sutherland, senior director of technology consulting at global consulting firm Protiviti.
What's falling through the data cracks
Some DataOps teams are building robust data pipelines to deliver tangible business value, while others are still struggling to get it right.
Data and AI projects are notoriously slow to progress from ideation to realization. There's usually no consistent path or pattern for businesses to follow other than loading the data into an existing data lake and figuring out how to find, filter, cleanse and leverage that data. Meanwhile, IT departments are usually backed up with high-priority operational requests, so the business may leverage its own staff to complete the work without a clear, defined path to completion.
A second issue is that some organizations don't understand the value of DataOps, so the effort lacks the budget and staff necessary to succeed. Problems most organizations face include the lack of classification for critical data elements, no single trusted inventory of available data assets, and minimal or inconsistent data definitions. There is no defined standard, discipline, methodology or consistency for developing, deploying and operationalizing data and AI assets, Sutherland said.
"DataOps principles ensure there is an accurate, up-to-date inventory of all data and AI-related assets and processes, and any new project must adhere to the standards and update that inventory accordingly," he said.
Of course, the data engineers building data pipelines need the support of other roles to deliver business value. Those roles typically include some combination of a chief analytics officer (CAO), a chief data officer, a chief digital officer, data scientists, data analysts and data architects.
"Number one is how does it integrate with your existing systems and how to do you marry the technology KPIs with the business KPIs?" Apshankar said.
Some examples of technology KPIs are system downtime, the number of tickets raised and the swiftness with which the system was brought in. The business KPIs include whether people are actually using it and how it helps reduce time and cost, Apshankar said.
In short, DataOps necessarily bridges the gaps between making data available in the first place and driving positive business value with it.
Future DataOps trends
The most mature DataOps teams adopt a DevOps mindset of continuous improvement, which can involve many processes and practices. Data pipelines speed data access and make working with data more reliable, but they are not "set-and-forget" assets, even though they may automate some tasks. Pipeline performance can degrade if system interconnectivity is brittle, the data is not delivered in a timely manner, or a necessary resource falters or changes.
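One of the failure modes above, data that is not delivered in a timely manner, lends itself to a simple automated check. The following is a minimal sketch, assuming each feed records a timezone-aware timestamp of its last successful delivery; the function name `stale_feeds`, the feed names and the 24-hour threshold are illustrative assumptions, not part of any particular tool.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def stale_feeds(
    last_delivered: dict,
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> list:
    """Return the names of feeds whose last delivery is older than max_age.

    last_delivered maps a feed name to a timezone-aware datetime.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(
        name for name, ts in last_delivered.items() if now - ts > max_age
    )


if __name__ == "__main__":
    # Hypothetical feeds and delivery times, for illustration only.
    current = datetime.now(timezone.utc)
    feeds = {
        "orders": current - timedelta(hours=1),
        "clickstream": current - timedelta(hours=30),
    }
    print(stale_feeds(feeds, max_age=timedelta(hours=24), now=current))
```

Running a check like this on a schedule is one way a team might notice a degrading pipeline before its consumers do.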
"Most data, analytics and AI projects fail because operationalization is addressed as an afterthought," Sutherland said. "The team should be evangelizing the business value of DataOps to the organization and monitoring all new and enhancement projects to ensure DataOps is an integral part of the development and deployment for each initiative."
Some of the skillsets Sutherland thinks organizations must develop include the following:
- Understand cloud-native technologies and platforms.
- Keep abreast of new products and product improvements.
- Build skills in data and AI-related agile software development lifecycle methodologies, development, unit testing and integration frameworks.
- Build skills in reuse and automation.
- Leverage collaborative data development tools that let them work more effectively with data architects, data engineers and data scientists.
DataOps teams need many of the skills that DevOps teams already have, which is one reason to consider developers as prime data engineer candidates. Like DevOps, DataOps ensures the timely delivery of a quality product; in this case, that product is data for analytics and training data for machine learning.