https://www.techtarget.com/searchenterpriseai/tip/How-to-use-AI-agents-for-infrastructure-management
Most organizations that have invested in AI tooling for their infrastructure teams aren't getting the return they were promised.
Gartner projected that global spending on AI-optimized IaaS will reach $37.5 billion in 2026. However, much of this spend will underdeliver. In a Gartner survey of 782 infrastructure and operations leaders, only 28% of AI use cases in infrastructure and operations fully meet ROI expectations, while 20% fail outright. This isn't the result of the models being inadequate but rather of an incomplete adoption strategy. Simply switching AI vendors or allocating more budget for costlier tools won't fix this.
To realize the benefits of AI agents automating infrastructure development, businesses must optimize their agents and provide them with business-specific data. Learn how to equip AI agents with the data they need to succeed and how to address the serious security and operational concerns this technology can pose at the infrastructure layer.
Engineers across organizations are treating AI agents as smarter search engines rather than properly embedding them into their platforms. They throw every incident, error and configuration issue at a generic AI agent, expecting it to magically solve them. In most cases, they get back responses that sound authoritative and seem helpful at a glance, and might even be correct in a vacuum, but are wrong for their environment and can break production.
AI agents can write infrastructure code, design configurations and reason through complex problems. But they have a structural blind spot that no prompt can overcome: They're limited by their training data. General-purpose tools, such as Claude Code and GitHub Copilot, rely on models trained only on publicly available data. By default, these agents know nothing about how a specific company operates: its internal standards, naming conventions and the institutional knowledge scattered across its repos, wikis and chat threads.
Engineers can spend many hours fixing and tweaking these AI agents to ensure they integrate effectively with their systems, thereby defeating the expected productivity gains. This is the gap that CIOs and executives must close when evaluating AI tooling for their infrastructure teams. Selecting an AI agent is half the battle. Whether that agent can deliver depends on how organizations feed it institutional knowledge.
There are three approaches businesses can use to feed their AI agents information on their infrastructure.
In the simplest approach, knowledgeable engineers include business-specific instructions in their prompts from memory. It could be as simple as: "Within this company, we use … " This works only as long as the engineer happens to remember the correct information. It becomes unreliable and unscalable when engineers get critical details wrong or when new team members don't know those details in the first place.
Engineers can point the AI to documentation describing internal standards, likely in a Markdown file, or copy its contents into every conversation with the model. But this is a manual process, and because documentation lags behind the systems it describes, it can quickly become stale.
More critically, organizational knowledge isn't just a handful of documents. It's scattered across Git repos, Notion pages, Confluence pages, Slack threads and Zoom transcripts. Many of these sources overlap and contradict each other, so copying and pasting them into every AI interaction is unsustainable.
Realistically, a single document might cover many topics, and it's inefficient to feed AI agents every detail when they only need the information relevant to the task at hand. Businesses should instead implement retrieval-augmented generation (RAG) with two pipelines: one for ingestion and one for retrieval.
The ingestion pipeline captures company documentation from wherever it lives and splits it into chunks. A vector database stores, manages and indexes embeddings of those chunks. The retrieval pipeline receives queries from the engineer and sends them to a Model Context Protocol (MCP) server, which converts each query into an embedding and performs a semantic search against the vector database to retrieve relevant chunks. The LLM then combines this specific operational context with its general knowledge to generate a response.
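The two pipelines can be sketched end to end with a toy word-hashing embedding and an in-memory store. This is a minimal illustration, not a specific product's API: a real deployment would swap in an embedding model and a managed vector database, with the MCP server sitting in front of `search`.

```python
import math
import zlib
from collections import Counter

def embed(text: str, dim: int = 64) -> list[float]:
    """Toy embedding: hash each word into a fixed-size bag-of-words vector.
    A real pipeline would call an embedding model here instead."""
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(word.encode()) % dim] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class VectorStore:
    """In-memory stand-in for a real vector database."""
    def __init__(self):
        self.rows = []  # (doc_id, vector, chunk text)

    def ingest(self, doc_id: str, chunks: list[str]) -> None:
        """Ingestion pipeline: embed each chunk and index it."""
        for chunk in chunks:
            self.rows.append((doc_id, embed(chunk), chunk))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Retrieval pipeline: embed the query, then return the top-k chunks
        by cosine similarity (vectors are already unit length)."""
        q = embed(query)
        scored = sorted(self.rows,
                        key=lambda row: sum(a * b for a, b in zip(q, row[1])),
                        reverse=True)
        return [chunk for _, _, chunk in scored[:k]]

# The retrieved chunks are what the MCP server injects into the LLM prompt.
store = VectorStore()
store.ingest("runbook", [
    "All production deployments use the internal Helm chart repo.",
    "Databases run on the managed Postgres cluster in eu-west-1.",
])
context = store.search("How are production deployments done?", k=1)
```

The split between `ingest` and `search` mirrors the two pipelines described above: documents are embedded once at ingestion time, while each engineer query only pays for one query embedding plus a similarity search.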
A Kubernetes controller can automate document ingestion, keeping the pipeline running continuously and in sync with documentation and resources as they change. For most infrastructure teams, Kubernetes is already where the workload lives, so there's no need to introduce a separate orchestration layer.
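The core of such a controller is a reconcile pass that detects which documents changed since the last pass, so only those are re-embedded. The sketch below assumes documentation sources exposed as plain strings; a real controller would watch Kubernetes resources or poll Git and wiki APIs, but the content-hash comparison is the same.

```python
import hashlib

def reconcile(sources: dict[str, str], seen_hashes: dict[str, str]) -> list[str]:
    """One reconcile pass: return the ids of documents whose content changed
    since the last pass. These are the only documents the ingestion pipeline
    must re-embed. A Kubernetes controller would trigger this on every
    resource change event, keeping the vector database in sync."""
    changed = []
    for doc_id, content in sources.items():
        digest = hashlib.sha256(content.encode()).hexdigest()
        if seen_hashes.get(doc_id) != digest:
            seen_hashes[doc_id] = digest
            changed.append(doc_id)
    return changed
```

Because the pass is idempotent, it's safe to run it on a timer as well as on change events, which is how controllers typically guard against missed notifications.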
Be aware that RAG adds some infrastructure complexities because there are several moving parts. Also, data quality is critical because poorly structured data can lead to unreliable results.
Data can get stale, too. If old chunks remain in the vector database after someone updates the source documents, retrieval will return conflicting information. Engineers should design the pipeline to remove old data rather than just append new data.
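A minimal sketch of that delete-then-insert discipline, using a plain list as a stand-in for the vector database (real stores typically expose a delete-by-filter or namespace API for the same purpose):

```python
class ChunkStore:
    """Minimal stand-in for a vector database that supports delete-by-doc."""
    def __init__(self):
        self.chunks = []  # (doc_id, chunk text)

    def upsert_document(self, doc_id: str, new_chunks: list[str]) -> None:
        # Delete every chunk from the previous version of this document first,
        # so retrieval can never mix old and new guidance for the same source.
        self.chunks = [(d, c) for d, c in self.chunks if d != doc_id]
        self.chunks.extend((doc_id, c) for c in new_chunks)

store = ChunkStore()
store.upsert_document("deploy-guide", ["Deploy with Helm v2."])
store.upsert_document("deploy-guide", ["Deploy with Helm v3."])
# Only the current version's chunks remain after the second upsert.
```

Keying deletion on the document id rather than on chunk content is what makes this safe: the old and new versions of a document rarely chunk identically, so appending alone would leave orphaned chunks behind.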
As AI agents get more embedded in infrastructure, they become a first-class security and compliance concern. The following are three key security areas businesses must address early:
The two main operational challenges that engineers must prepare for when using AI agents for infrastructure development are context window restraints and cost.
Eventually, agents will be working with a lot of data from various sources. If engineers keep piling this data into the AI agent's context window, it will soon fail. Broader context doesn't provide better results. Instead, it can lead to degraded performance, higher costs and inaccurate responses that make the system useless.
To prevent this, each interaction with the MCP server should start with a completely fresh context. The server supplies the relevant information needed for the specific task at hand, regardless of when that information was originally ingested, rather than relying on context accumulated across previous interactions.
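Assuming an OpenAI-style message list, the idea reduces to assembling the prompt from scratch on every call instead of appending to a running history. The function name and wording below are illustrative, not a specific MCP client's API:

```python
def build_request(task: str, retrieved_chunks: list[str]) -> list[dict]:
    """Assemble a fresh message list for a single task. Nothing from earlier
    interactions is carried over; the only context included is what the
    retrieval step just fetched for *this* task."""
    context = "\n".join(retrieved_chunks)
    return [
        {"role": "system",
         "content": "Answer using only the provided company context."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nTask: {task}"},
    ]

request = build_request("Update the ingress config",
                        ["Production uses the internal ingress controller."])
```

Because the message list is rebuilt per task, the context window stays bounded by the retrieval budget rather than growing with conversation length.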
Costs for agentic AI systems multiply quickly when running multiple systems simultaneously. A single query could trigger a multistep reasoning chain that calls multiple tools and burns through tokens. With model routing, engineers can route different types of requests to agents running different models.
Performing the routing within the agent itself works well: The agent decides which model to use for each task. For simpler tasks, such as summarizing and classifying data, it can call a cheap model and reserve more powerful models for heavy reasoning.
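A deliberately naive routing sketch, with hypothetical model names and a keyword heuristic standing in for the lightweight classifier an agent would typically use for this decision:

```python
CHEAP_MODEL = "small-model"       # hypothetical model identifiers
REASONING_MODEL = "large-model"

# Task types that rarely need multistep reasoning.
SIMPLE_KEYWORDS = ("summarize", "classify", "extract", "reformat")

def route(task: str) -> str:
    """Send simple transform tasks to a cheap model and keep the expensive
    model for multistep reasoning. Production routers usually make this
    decision with a small classifier model rather than keywords."""
    lowered = task.lower()
    if any(keyword in lowered for keyword in SIMPLE_KEYWORDS):
        return CHEAP_MODEL
    return REASONING_MODEL
```

Even a crude router like this caps spend, because the expensive model only sees the fraction of traffic that actually needs its reasoning depth.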
For IT leaders making or defending investments in agentic AI within infrastructure, an architecture that truly delivers on the promise should include the following:

- A RAG pipeline that feeds agents business-specific context from wherever institutional knowledge lives.
- Continuous, automated ingestion that keeps that context in sync with source documentation and removes stale data.
- Context hygiene, with each task starting from a fresh, task-scoped context window.
- Model routing that sends simple tasks to cheap models and reserves powerful models for heavy reasoning.
- Security and compliance controls that treat agents as first-class infrastructure components.
Wisdom Ekpotu is a DevOps engineer and technical writer focused on building infrastructure with cloud-native technologies.
21 Apr 2026