Good governance key to reducing high AI project failure rate
Better governance of cutting-edge applications and the data that feeds them is key to overcoming development obstacles, Databricks exec Craig Wiley said in a recent interview.
Ever since OpenAI's November 2022 launch of ChatGPT marked significant improvement in generative AI (GenAI) technology, enterprises have increasingly invested in developing AI-powered applications that can make employees better informed and business processes more efficient.
According to research and advisory firm Gartner, worldwide spending on AI projects will reach $2.5 trillion in 2026 and $3.3 trillion in 2027, up from $1.8 trillion in 2025 and just under $2 trillion in 2024.
Meanwhile, from formerly niche vendors such as Alation and Informatica to platform vendors including Databricks and Snowflake, AI and data management vendors have created development frameworks designed to make it easy for customers to build complex AI-powered applications.
Yet despite all the money enterprises are pouring into AI development, and despite all the efforts vendors have made to simplify creating cutting-edge AI tools, the AI project failure rate remains staggeringly high, with estimates ranging from about three-quarters of all initiatives to as much as 95 percent.
While the reasons AI projects don't make it past the pilot stage into production vary depending on the enterprise, a primary one is that AI tools aren't fed the right data. Whether the problem is that relevant data for a specific initiative can't be found and retrieved, that data is isolated in disparate systems and can't be accessed, or that data simply isn't high quality, data-related issues stall AI initiatives.
With development frameworks making little difference, some vendors in early 2026 have tried other means of reducing the failure rate of AI projects. For example, Databricks launched Instructed Retriever to improve data retrieval accuracy, MongoDB released new embedding and reranking models to similarly increase the accuracy of retrieval pipelines, and vendors such as GoodData and Snowflake have made semantic modeling a priority to make data more consistent and discoverable.
Craig Wiley, vice president of product for AI at Databricks and formerly an AI and machine learning product leader at AWS and Google Cloud, has witnessed enterprises' AI initiatives fail and the attempts vendors have made to better enable successful AI development.
In a recent interview, he discussed the problems that stall AI initiatives before they make it into production and what enterprises can do to improve the likelihood of a project delivering on its promise. In addition, Wiley delved into whether the problems enterprises face today will eventually disappear and whether new ones will arise as AI continues to evolve.
Editor's note: This Q&A has been edited for clarity and conciseness.
I know issues with data are a huge reason for the high failure rate of AI projects, but before we delve into the problems with data that prevent pilots from moving forward, what are some other reasons that AI initiatives fail?
Craig Wiley: When I talk to CIOs, it comes down to folks asking three questions. The first is, 'Can I control this thing?' At first, there was fear that AI was going to steal an enterprise's data. Now, there's a fear that it's going to hit an API it's not supposed to hit. The second is, 'Does it work?' That's about asking it to do something and whether it does it and does so in the way that was expected. The third is, 'How much does it cost?'
I call those the CFO test. If all you're trying to launch is something that reads emails and prioritizes them, the CFO doesn't care. But if you're going to launch something externally facing, you're going to need to justify that with the CFO and the CIO, and it comes down to whether you can control it, whether it performs and whether you can afford it.
Now delving into data, what is the biggest data-related problem you see companies face that results in AI project failure, and why does it halt projects before they're moved into production?
Wiley: I call this the CEO W2 problem, which is that if I build an agent that, for example, helps my organization with HR-related things, I want to make sure it doesn't share the CEO's W2 with anyone else.
It's not just about how I can build these things, but it's also about … the governance -- whether I can ensure that whoever is communicating with this system has access to whatever it is the agent accesses. I often hear folks say they'll just manage the agent's identity. There are times when you want to manage an agent's identity, and there are times when you want to have that agent be at the behest of the user's identity. You need that governance layer so that I can confidently tell my CEO that I won't share anyone's W2 with anyone that doesn't explicitly have permission to view that.
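Wiley's distinction -- an agent acting "at the behest of the user's identity" rather than under its own -- amounts to on-behalf-of authorization: every read the agent performs is checked against the end user's permissions, not the agent's. A minimal sketch of that check, with all names and the permission model hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    content: str
    allowed_users: set = field(default_factory=set)  # users with explicit read permission


@dataclass
class GovernedStore:
    documents: dict  # doc_id -> Document

    def fetch(self, doc_id: str, on_behalf_of: str) -> str:
        """Return content only if the *end user* the agent is serving --
        not the agent itself -- has explicit permission to read it."""
        doc = self.documents.get(doc_id)
        if doc is None or on_behalf_of not in doc.allowed_users:
            raise PermissionError(f"{on_behalf_of} may not read {doc_id}")
        return doc.content


# The CEO's W2 is readable by the CEO and payroll, and no one else,
# no matter which agent asks for it.
store = GovernedStore({
    "w2-ceo": Document("w2-ceo", "<CEO W2 data>", {"ceo", "payroll"}),
})

print(store.fetch("w2-ceo", on_behalf_of="ceo"))  # allowed
try:
    store.fetch("w2-ceo", on_behalf_of="intern")  # agent querying for another user
except PermissionError as err:
    print("denied:", err)
```

The key design choice is that the permission check sits in the data layer the agent calls into, so no prompt or tool-use mistake by the agent can bypass it.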
This technology was so easy to use when it first launched. People would say, 'Rewrite the lyrics to Hamilton so it's about my family.' What we're forgetting is that when it got it wrong, we just smiled and nodded. Now, if it's referencing data, I want to make sure I have it locked down so that it can only see, and the user can only see, the data we explicitly want it to.
How can a level of governance be put in place that results in more successful AI development?
Wiley: We see that companies that are using a data catalog to manage not only data but also agents and all the things those agents connect to -- a catalog that can really manage the entire agent sprawl -- are launching 12 times the number of agents compared to those that aren't. [Successful development is] about this idea of whether it can be controlled, whether it works and what it costs.
Governance is about that first one, whether it can be controlled at every step and what exactly it can and can't do as far as what systems it can access and what data it can access.
What's an example of a company using a data catalog to govern data and move AI projects into production?
Wiley: Databricks works with a company called Edmunds.com, a car-shopping website. They had data in all different kinds of systems, and none of it was connected. They created a chat system through which they could talk to and access all this information. It had a massive impact, both on customer experience and on being able to identify and understand dealership opportunities and traffic around the dealerships and what have you. By having that governance layer, not only does it protect, but it also creates the discovery to deliver the most relevant data given the problem you're trying to solve.
Beyond lack of governance, what is another data-related problem that is contributing to the high failure rate of AI projects?
Wiley: Folks have data in lots of different systems. Whether it's in different databases and operational systems or in SaaS systems, accessing the data is a problem. So, how do you access it while protecting privacy and having access controls, and how do you access the right information?
We offer a lot of models, so we see a lot of people switching out what models they use to drive costs down -- maybe they can get away with something cheaper, or an open-source model. Smaller models can perform on par with some state-of-the-art models, but only if we really nail the information we give them to answer the question we're asking. That can be done with access to as many data silos as possible so the most relevant information is available for the problem that is trying to be solved. You need access to the data to drive the accuracy.
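Wiley's point that smaller, cheaper models can hold their own "only if we really nail the information we give them" is the core of retrieval quality: search every silo and hand the model only the most relevant passages. A toy sketch of that idea, using crude lexical overlap as a stand-in for a real embedding-based retriever (all names and data are hypothetical):

```python
def score(query: str, passage: str) -> float:
    """Crude lexical relevance: fraction of query terms found in the passage.
    A production system would use embeddings and reranking instead."""
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0


def retrieve(query: str, silos: dict, k: int = 2) -> list:
    """Search every silo, not just one, and keep the k most relevant passages.
    The returned context is what a small model would be asked to answer from."""
    candidates = [(score(query, text), silo, text)
                  for silo, passages in silos.items()
                  for text in passages]
    candidates.sort(key=lambda c: c[0], reverse=True)
    return [(silo, text) for s, silo, text in candidates[:k] if s > 0]


# Data scattered across three hypothetical silos.
silos = {
    "crm":     ["acme corp renewal due in march", "beta llc churned last year"],
    "tickets": ["acme corp reported a billing error in february"],
    "wiki":    ["holiday schedule for 2026"],
}

context = retrieve("acme corp renewal", silos)
# The CRM passage scores highest because it matches every query term.
```

The design point matches the interview: accuracy comes less from the size of the model than from whether the retrieval step can reach all the silos and surface the right passages.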
Is data quality -- or lack thereof -- still a problem that plays a part in the high failure rate of AI projects?
Wiley: I'm lucky because a lot of customers I talk to have already made an explicit investment in cleaning up their data, so I see the world through slightly rose-colored glasses. Having said that, this is absolutely a problem. The good news is that you can use generative AI to accelerate solutions to this. Things like master data management used to be really complex. Now, it's a lot easier. When you can get data cleaned up using generative techniques, it makes it a lot easier.
My first role in cloud AI was as the general manager of AWS SageMaker. We have the same problem today as we had back then with machine learning, which is that if you had garbage data you had a garbage model. If you're building an agent and have garbage data, you're not going to have good accuracy and outputs.
It seems like most data management vendor announcements over the past few years have been about enabling users to successfully develop GenAI chatbots and agents -- why haven't the tools organizations have been using lowered the high failure rate?
Wiley: Because this technology comes off as so easy when you first interact with it, all the vendors have been trying to meet the expectation of making it simple. In our imagination, front-line employees and CEOs can build an agent, so everyone has been trying to simplify, simplify, simplify. The cost is that folks haven't been paying attention to what it takes to build a high-quality system.
We work with a leading women's healthcare app that has 420 million customers globally. They have a system that users can chat with about what they're experiencing. To build a system of that caliber takes teams of people. We've been talking about governance and evaluation, but if you walked into a meeting of a random AI or data platform provider with their customer, nine times out of 10 they would be talking about how they're going to make it easy for the customer to build agents. Of course we want to make it easier to build agents. But we don't want to do that if it's at the cost of building agents they can rely on. That's been where the industry has missed the mark.
If you were building an AI development stack from scratch, what capabilities would you use?
Wiley: I would have the ability to ingest any data, and the ability to parse that data and make it discoverable. I would want to have the ability to access Model Context Protocol servers so I can access other systems. I would want to have a system that is metadata-aware of all those data assets, knowing where the data comes from, who created it, how often it gets used and other metadata that tells me this is something I should pay attention to. Once I've got that, I would want access to every model and every orchestration system, and then I would have strong evaluation capabilities.
The folks that are building these systems are application developers. They're not expert agent developers. They're app developers that in the last six months found out that agents exist and now they have to build them. We should be making it as easy as possible for them to tune the performance of an agentic system. Lastly, quality monitoring and logging to drive a continuous learning loop.
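The metadata-aware catalog Wiley describes -- one entry per asset, recording lineage, ownership and usage -- can be pictured as a simple record type. A hypothetical sketch of what such an entry might track and how usage signals could surface the assets worth paying attention to:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CatalogEntry:
    """One asset in a metadata-aware catalog (hypothetical shape)."""
    name: str
    kind: str             # e.g. "table", "model", "agent", "mcp_server"
    owner: str            # who created it
    source: str           # where the data comes from (lineage)
    last_used: date
    monthly_queries: int  # usage signal: is this asset worth trusting?


def rank_by_usage(entries: list) -> list:
    """Surface the most-used assets first -- the ones Wiley says
    'tell me this is something I should pay attention to.'"""
    return sorted(entries, key=lambda e: e.monthly_queries, reverse=True)


catalog = [
    CatalogEntry("sales_agent", "agent", "data-eng", "crm.orders",
                 date(2026, 2, 1), monthly_queries=940),
    CatalogEntry("old_export", "table", "unknown", "legacy.dump",
                 date(2024, 5, 3), monthly_queries=2),
]

top = rank_by_usage(catalog)[0]  # the heavily used sales_agent surfaces first
```

Real catalogs track far richer metadata (schemas, access policies, quality scores), but the principle is the same: lineage and usage metadata are what make agents and their data assets governable and discoverable.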
While building an AI development stack from scratch is ideal, most organizations can't simply overhaul everything, so what is the number one thing you would recommend they do to reduce the failure rate of their AI initiatives?
Wiley: It would really be about making it all discoverable and making it able to control what agents are going after. We've seen our catalog business, in terms of AI governance, grow seven-fold over the last nine months. That is the missing key to creating enterprise AI. It's one thing if someone is at home and has OpenClaw running on their Mac mini, but when it's done in the enterprise, the rules are different. And they're different for a good reason. We've been doing data governance for 20 years. … Last year, everyone wanted to know how to create an agent. This year, everyone wants to know how to manage all the agents they've created.
We're just over three years out from ChatGPT's launch, which kicked off this new AI era. In another three years, do you think the AI project failure rate will be way down?
Wiley: I think it will. But this is an extraordinarily new set of capabilities, and it's moving unbelievably quickly, so my hot take is that there will still be a frustratingly large number of failed experiments. I say that because the frontier, the boundary of what's capable, is going to be moving so fast. In three years, people will have built a lot of their own software services and will be running more and more agentic systems at scale. But because the frontier of capabilities is going to keep moving so fast over the next few years, and because we're going to keep learning how to get so much better at this, what we ask of these systems is also going to grow exponentially.
We'll get good at the stuff that seems hard today. That's not going to be the stuff that's hard three years from now. What will be hard three years from now, I don't know, but I'm extraordinarily excited.
Eric Avidon is a senior news writer for Informa TechTarget and a journalist with more than three decades of experience. He covers analytics and data management.