It's difficult to define and build an infrastructure that enables AI and machine learning to integrate with existing operations.
One of the biggest problems enterprises run into is applying a development lifecycle that worked for DevOps to the development and deployment of AI. Instead, they need to adopt new principles for their development lifecycle, captured in ideas like AIOps and MLOps.
Another problem is that modern AI applications are cobbled together from many best-of-breed tools. Without integration into a coherent infrastructure, this can give rise to shadow AI. Shadow AI refers to AI that isn't under the control of a company's IT department and may not have proper security or governance measures.
By addressing data infrastructure challenges, companies can promote collaboration and a shared understanding across teams.
Managing the AI infrastructure
The number one infrastructure issue arises from the mismatch between how AI and machine learning models are built and how they are deployed. This can strain data centers as data engineers struggle to keep up with new models developed by data scientists.
When an enterprise sees delays in AI deployments, even though it has invested in more data scientists, it's a clear sign of this problem, said Kenny Daniel, CTO and co-founder of Algorithmia, a machine learning infrastructure provider. An Algorithmia survey on enterprise machine learning trends found that in the last year, 83% of organizations have increased their AI/ML budgets and the average number of data scientists employed has grown by 76% -- but the time required to deploy a model is going up, with 64% of organizations taking a month or longer.
IT teams should consider fleshing out the infrastructure to address the lifecycle of AI projects using an AIOps or MLOps approach, much as DevOps has proven successful in accelerating software development.
"AIOps can provide the data engineering framework, model management, governance mechanism and a workbench of tools and methodologies to run AI models at scale," said Sanjay Srivastava, chief digital officer at Genpact, a digital transformation consultancy.
He sees AIOps playing a role in virtualizing AI infrastructure, managing cloud-based privacy concerns and fostering AI ethics. Key to getting this right is bringing in oversight that is independent of the AI project. Independent oversight can reduce unintended bias, constrain a model to its proper use case and ensure the design accounts for inclusion and comprehensiveness.
"In the end, we believe most companies, like a financial audit, will bring AI ethics as a board agenda item," Srivastava said.
Companies earlier in their AI infrastructure journey should start as simply as possible to work out the bottlenecks in their unique workflows.
"Deploying a basic first version of your solution and then benchmarking, iterating and improving from there will help to prevent issues in production," said Andrew Maguire, senior machine learning engineer at infrastructure monitoring startup Netdata.
Getting data infrastructure ready for AI
Modern AI infrastructure is often cobbled together from various best-of-breed components, with each handling one task well.
"One of the biggest AI infrastructure issues we are seeing right now is organizations struggling to choose the right combination of tools to serve their AI stack," noted Jerry Kurtz, executive vice president of insights and data at Capgemini North America.
The landscape of AI tools and AI-specific offerings in this space is so large that identifying that right combination becomes a challenge, as organizations are faced with too many choices.
Having so many AI tools to choose from also makes it more difficult for companies to integrate the tools they do select with whichever legacy systems they have in place. This is a major constraint on organizations trying to build a whole stack of tools that work cohesively together.
AI infrastructure also creates challenges around new data center cycles and longer cycle times, as developing the right data center design for AI-specific needs requires new skills and workload knowledge, said Holland Barry, senior vice president and field CTO at Cyxtera, vendor of a data center platform.
AI infrastructure implementation cycles are typically longer too. It can take months to procure, integrate and troubleshoot all the various layers of technology needed to support AI at scale.
This has led to a growth in shadow AI practices rolled out in the absence of IT involvement. It happens when data scientists and developers can't find the resources in-house, when their company's data centers weren't engineered for AI workloads or when the IT team can't move fast enough to meet a deadline.
"With a cloud instance just a quick credit card swipe away, a shadow AI strategy sometimes feels like the only way to get your project off the ground," Barry said.
He recommends that IT take the lead with infrastructure to shift AI from hiding in siloed projects to something resourced and centralized as a shared service. Centralization can improve the ability to develop and replicate an IT standard that ensures the optimal balance of data center resources across compute, storage and networking. It can also help bring AI out of the shadows and help democratize access to AI services across the organization.
IT, business and AI engineer collaboration
Another key infrastructure challenge can arise from an overemphasis on the models rather than the infrastructure for coordinating the data required to build models.
This can lead to challenges in integrating AI and machine learning into existing business processes, said Dr. Manjeet Rege, director of the Center for Applied Artificial Intelligence at the University of St. Thomas in St. Paul, Minn.
He found that it's common for companies to retrieve data from data warehouses and hand it over to the AI team without a proper hypothesis. As a result, a lot of AI projects work at a prototype level but struggle going into production due to the disconnect between business, IT and AI engineers.
"The focus needs to shift from keeping the data constant and endlessly tweaking the model to cleaning and prepping the data and building a model based on that information," Rege said.
IT staff helps provide the supporting infrastructure while AI engineers build AI models that benefit the business using the IT infrastructure.
Project teams need members from IT, AI engineering and the business who work together.
"This not only helps the AI engineers to understand the business problems better but also helps the business and IT teams to appreciate what kind of data needs to be collected for AI projects to succeed," Rege said.
Another problem involves discrepancies in the scale of data infrastructure. Data scientists learn to build models on a small scale, usually on their laptops. However, enterprises produce much larger data sets that they store on many servers.
AI researcher Adrian Zidaritz found it helps to clarify the structure of the data and to firmly establish the relationships between different data items from the outset. Enterprises often neglect this upfront effort. It's also helpful for companies to develop a data vocabulary on which all data engineers, data scientists and management can agree.
Another component of data collaboration lies in improving data access.
When it comes to the data experience, most companies juggle disconnected data silos, challenging transformations and mismatched views, said Fritz Heckel, senior staff machine learning engineer at ASAPP, an AI customer experience platform.
Pernicious errors can crop up between prototype and production models, data may differ between online and offline settings, and cloud privacy concerns add complexity to the entire data pipeline, especially around data removal.
As a result, enterprises unintentionally task highly skilled AI workers with managing problems that slow production and introduce product risk.
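One way to catch the online-versus-offline deviation described above is to compare simple statistics for each feature across the two settings. The following is only a minimal sketch under stated assumptions, not a method attributed to Heckel or ASAPP: the function names `mean_shift` and `skewed_features`, the feature names and the 0.5 threshold are all hypothetical, and a real pipeline would likely use richer tests than a shift in the mean.

```python
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def mean_shift(offline: Sequence[float], online: Sequence[float]) -> float:
    """Absolute difference in means, in units of the offline standard deviation."""
    spread = pstdev(offline)
    diff = abs(mean(online) - mean(offline))
    if spread == 0:
        # A constant offline feature: any change at all counts as an unbounded shift.
        return 0.0 if diff == 0 else float("inf")
    return diff / spread


def skewed_features(
    offline: Dict[str, Sequence[float]],
    online: Dict[str, Sequence[float]],
    threshold: float = 0.5,  # hypothetical cutoff, chosen only for illustration
) -> List[str]:
    """Names of features present in both settings whose mean shifted past the threshold."""
    return sorted(
        name
        for name in offline
        if name in online and mean_shift(offline[name], online[name]) > threshold
    )


# Illustrative values only, not real measurements.
offline_sample = {"latency_ms": [10, 12, 11, 13], "message_length": [5, 5, 6, 6]}
online_sample = {"latency_ms": [20, 22, 21, 23], "message_length": [5, 6, 5, 6]}
flagged = skewed_features(offline_sample, online_sample)
```

Here `flagged` contains only `latency_ms`, because its online mean sits far from the offline mean, while `message_length` has the same mean in both samples.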
Heckel's team adopted the following three best practices that have helped improve their data collaboration infrastructure:
- Ensure you have a single source of truth. It should be obvious how to correctly access data, whether it is the latest data, versioned data, data for a given client or the latest version of a given data transformation.
- Provide single sources of authorization. Don't make employees juggle different ways of logging into systems or different access credentials.
- Develop a common language for talking about your data assets, and from there, buy or build a common API to tie together your resources. "Even if the back end is messy, improving the data experience will do wonders for AI worker efficiency," Heckel said.
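The first and third practices can be illustrated with a small sketch of a common data API. This is a hedged illustration, not ASAPP's implementation: the `DataCatalog` class, its methods and the `tickets` data set are hypothetical names invented for the example, and the loaders stand in for whatever messy back end actually holds the data.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A loader is any zero-argument function that returns rows (hypothetical convention).
Loader = Callable[[], List[dict]]


@dataclass
class DataCatalog:
    """Hypothetical single entry point for data sets, keyed by name and version."""

    _entries: Dict[str, Dict[int, Loader]] = field(default_factory=dict)

    def register(self, name: str, version: int, loader: Loader) -> None:
        versions = self._entries.setdefault(name, {})
        if version in versions:
            raise ValueError(f"{name} v{version} is already registered")
        versions[version] = loader

    def latest_version(self, name: str) -> int:
        if name not in self._entries:
            raise KeyError(f"unknown data set: {name}")
        return max(self._entries[name])

    def get(
        self,
        name: str,
        version: Optional[int] = None,
        client: Optional[str] = None,
    ) -> List[dict]:
        """Return the latest data by default, a pinned version, or one client's rows."""
        resolved = self.latest_version(name) if version is None else version
        if resolved not in self._entries[name]:
            raise KeyError(f"{name} has no version {resolved}")
        rows = self._entries[name][resolved]()
        if client is not None:
            rows = [row for row in rows if row.get("client") == client]
        return rows


# Illustrative data only.
catalog = DataCatalog()
catalog.register("tickets", 1, lambda: [{"client": "a", "n": 1}])
catalog.register("tickets", 2, lambda: [{"client": "a", "n": 2}, {"client": "b", "n": 3}])
latest_rows = catalog.get("tickets")
client_b_rows = catalog.get("tickets", client="b")
```

Callers ask for data by name and the catalog resolves "latest", a pinned version or a client filter in one place, so the same question always has one answer regardless of how the underlying storage is organized.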