
Guest Post

Beyond the chatbot: Engineering the agentic enterprise

Agentic AI is taking over where chatbots leave off. These systems let enterprises automate workflows, driving operational change and doing real work.

The fun of the friendly chatbot has worn off. Most enterprise AI applications today feel more like overenthusiastic interns who can summarize the last meeting but haven't figured out how to get the real work done. We've solved the conversation. Now it's time to solve the work.

In many corporate boardrooms and engineering departments across the world, there's an unspoken concern. After almost three years of enthusiastic spending on generative AI, most enterprises are stuck at the same point: The demos are slick and the stakeholders are guardedly optimistic, yet little of it has made it into production.

All the major platform technologies that transformed business have gone through a similar arc. They started with a proof-of-concept (PoC) demonstration, then a trial or pilot deployment, followed by a massive infrastructure rollout to support it. Ultimately, businesses benefited from the resulting operational transformation. Enterprise AI is following the same path, and we're now entering the infrastructure rollout of what we call agentic AI.

Throughout my 10 years as a product lead and architect on technology transformations in the IT industry -- from ERP modernization projects to migrating applications to cloud-native environments -- I've seen this pattern repeat itself with each major technological disruption. Those who ultimately succeed are typically not the ones with the most advanced models or the biggest budgets for AI research. Rather, they're the ones that resolve the infrastructure problem -- the "plumbing" below the intelligence.

The proof-of-concept purgatory problem

PoC purgatory is where things can get bogged down. It's the result of a fundamental architectural flaw. Enterprises created their AI strategy based on a completely passive model: retrieval-augmented generation. RAG systems are quite good at one task -- allowing a large language model to have access to a curated knowledge base so it can answer questions with relevant, up-to-date context.


However, RAG systems are incapable of acting. They can't start tasks, plan, route a task to the correct system, recover from a failed operation or make decisions in ambiguous situations. RAG systems are simply very intelligent reference librarians. But enterprises don't need a smarter reference librarian; they need capable operators. That's where agentic AI can help.

The 3 pillars of production-grade agentic systems

As the term agentic AI becomes increasingly trendy, it risks losing its meaning. An agentic workflow is a system in which an AI model doesn't simply respond to a prompt. Instead, it pursues an objective through iterative steps: using available resources to create intermediate products, evaluating the intermediate results and adjusting its approach to the problem based on what it has observed.
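That plan-act-observe cycle can be made concrete with a minimal sketch. Everything here is illustrative: in a real system, `plan_next_step` would call an LLM, and the tools would hit real APIs rather than the stubbed handlers below.

```python
# Minimal sketch of an agentic reasoning loop. All names are hypothetical;
# plan_next_step stands in for an LLM planner.

def plan_next_step(goal, observations):
    """Stand-in for an LLM planner: picks the next action from what it has seen."""
    if not observations:
        return ("fetch_invoice", {"invoice_id": "INV-1001"})
    if observations[-1].get("status") == "unpaid":
        return ("draft_reminder", {"invoice_id": "INV-1001"})
    return ("done", {})

# Stubbed tools; real ones would call enterprise systems.
TOOLS = {
    "fetch_invoice": lambda invoice_id: {"invoice_id": invoice_id, "status": "unpaid"},
    "draft_reminder": lambda invoice_id: {"invoice_id": invoice_id, "status": "reminder_drafted"},
}

def run_agent(goal, max_steps=5):
    observations = []
    for _ in range(max_steps):          # iterate: plan -> act -> observe
        action, args = plan_next_step(goal, observations)
        if action == "done":
            break
        observations.append(TOOLS[action](**args))   # act, then record the result
    return observations

trace = run_agent("chase overdue invoice INV-1001")
```

The `max_steps` bound matters: an agent that loops without a hard ceiling is an agent that can run away.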

The following are the three pillars that make up production-grade agentic systems:

1. Orchestration frameworks

An orchestration framework defines the agent's reasoning loop -- the structure of its behavior. It might be LangGraph, AutoGen or the tool-use primitives being developed at Anthropic, among a few other possibilities. Each of these frameworks can be viewed as a template for multi-step, multi-tool agent execution, ensuring agents follow predictable, auditable workflows. One of the key engineering decisions businesses need to make is how to lay out the graph topology that defines each agent's behavior and the structural relationships between agents.

2. Tool integration and state management

An agent without tools is no different from a sophisticated chatbot. An agent influences its environment using tools such as APIs, database connectors, code execution environments and web browsers. The Model Context Protocol (MCP) is emerging as a de facto standard for tool-to-model communication; it cleanly separates what an agent knows from the actions it takes.
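The separation of reasoning from action is easiest to see in code. The sketch below is not the MCP API; it's a generic tool-registry pattern, with hypothetical names, showing how an agent invokes tools by name without knowing anything about the systems behind them.

```python
# Illustrative tool-dispatch sketch (not the actual Model Context Protocol API).
# The agent sees only tool names; the handlers hide the systems being touched.

class ToolRegistry:
    def __init__(self):
        self._tools = {}

    def register(self, name, handler):
        self._tools[name] = handler

    def call(self, name, **kwargs):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)

registry = ToolRegistry()
# A stubbed handler; a real one would query an order-management system.
registry.register("lookup_order", lambda order_id: {"order_id": order_id, "state": "shipped"})

result = registry.call("lookup_order", order_id="A-17")
```

Because every action flows through one chokepoint (`call`), this is also the natural place to attach logging and access policies later.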

State management only reveals its difficulty once you implement it in an enterprise application: the system is highly dynamic and distributed, yet the underlying framework must stay agile. An agent responsible for business workflows that can take hours or days to finish must persist a large amount of data, including analysis results, business decisions, user input and error messages. State management, then, comes down to designing a solid persistence layer that handles memory, sessionless requests, agent restarts and parallelization. It shouldn't be ignored.
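A checkpointing sketch shows the core idea, stripped to its simplest form. This is illustrative only: a production persistence layer would use a database with locking and versioning, not a local JSON file.

```python
# Minimal checkpointing sketch: persist the agent's state after each step so a
# restarted agent resumes a long-running workflow instead of starting over.
import json
import os
import tempfile

def save_checkpoint(path, state):
    with open(path, "w") as f:
        json.dump(state, f)

def load_checkpoint(path):
    if not os.path.exists(path):
        return {"step": 0, "results": []}   # fresh run: no prior state
    with open(path) as f:
        return json.load(f)

path = os.path.join(tempfile.mkdtemp(), "agent_state.json")

state = load_checkpoint(path)
state["results"].append("analysis complete")   # work done in this step
state["step"] += 1
save_checkpoint(path, state)

resumed = load_checkpoint(path)   # simulate an agent restart
```

The point is that the agent process itself holds no irreplaceable state: kill it mid-workflow and the next instance picks up from the last checkpoint.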

3. Safety guardrails and human-in-the-loop design

This is the area that's most tempting to simplify, and thus often the place where the most shortcuts are taken. It's also often the source of the first publicly disclosed problems. Any truly autonomous system needs a clearly defined workspace and a set of access control policies for its tools, covering at minimum which operations are read-only, which can write and which require human approval before running.
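A policy table can be that simple at its core. The sketch below (tool names and policy fields are hypothetical) lets read operations run freely while write operations block until an approval flag -- in practice, a human sign-off -- is set.

```python
# Illustrative guardrail sketch: each tool carries a policy, and write
# operations are held until human approval arrives.

POLICIES = {
    "read_ticket":  {"mode": "read",  "needs_approval": False},
    "close_ticket": {"mode": "write", "needs_approval": True},
}

def guarded_call(tool, approved, action):
    policy = POLICIES[tool]
    if policy["needs_approval"] and not approved:
        # Park the action instead of running it; a human decides later.
        return {"status": "pending_approval", "tool": tool}
    return {"status": "executed", "result": action()}

read_result  = guarded_call("read_ticket",  approved=False, action=lambda: "ticket body")
write_result = guarded_call("close_ticket", approved=False, action=lambda: "closed")
```

The key property: the write never executes as a side effect of the check. It either runs after approval or stays parked.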


Human-in-the-loop escalation is a design pattern, not a failure recovery mechanism. The agent acts autonomously for as long as it remains within a set of confidence bounds; when it finds itself outside those bounds, it surfaces the issue to a human operator. Acting autonomously as long as possible leads to more reliable system behavior. In addition, the trust an organization puts in an agent will grow over time, enabling the agent to operate independently for longer.
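In code, the confidence-bound pattern is a single gate. The threshold and field names below are illustrative; real systems would calibrate the threshold against measured accuracy rather than pick a number.

```python
# Sketch of confidence-gated autonomy: the agent commits its own decision while
# confidence stays inside the bound, and escalates to a human when it doesn't.

CONFIDENCE_THRESHOLD = 0.85   # illustrative; calibrate against real accuracy data

def decide(prediction, confidence):
    if confidence >= CONFIDENCE_THRESHOLD:
        return {"action": prediction, "handled_by": "agent"}
    # Below the bound: hand off, but pass the proposal along to save the human time.
    return {"action": "escalate", "handled_by": "human", "proposed": prediction}

auto_case      = decide("approve_refund", confidence=0.93)
escalated_case = decide("approve_refund", confidence=0.60)
```

Note that the escalated case still carries the agent's proposal: the human reviews a suggestion rather than starting from scratch.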

The organizational readiness gap

In my experience, the most consistent obstacle to the successful deployment of agentic enterprise systems isn't technical architecture, but rather the lack of organizational readiness.

Agentic AI exposes every ambiguity in business process design that human employees have compensated for through institutional knowledge, informal escalation and contextual judgment. An agent won't know how to handle an undefined exception; it will simply follow the rules it was given. If those rules are incomplete or conflicting, the agent will fail -- often in a way that's both visible and expensive.

Organizations serious about deploying agentic enterprise systems should treat process documentation as a prerequisite for successful deployment, not as a deliverable.

A practical roadmap: From chatbot to agent

The following are the four phases to follow when transitioning from a chatbot to an agent:

Phase 1. Audit

Describe the five to 10 business workflows that the chatbot or RAG system supports. List each step of the end-to-end workflow, including all the systems engaged, the exceptions and the points where a human must intervene. This yields a list of business workflows that are suitable for agent automation, and it highlights any steps that must be addressed before a workflow can be automated.

Phase 2. Instrument

Before sending an agent off to do its thing in the wild, there are a few pieces of setup to get right: trace logging for every tool call, telemetry on latency and decision costs, and a good set of unit and integration tests to catch regressions. If teams can't see the agent at work in production, they'll never be able to debug, improve or trust it.
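The trace-logging piece can be as small as a decorator. This is a minimal sketch with hypothetical tool names; production systems would ship these records to an observability backend rather than an in-memory list.

```python
# Minimal instrumentation sketch: wrap every tool call with a trace record
# (tool name, arguments, latency) so the agent's work is visible.
import functools
import time

TRACE = []   # stand-in for an observability backend

def traced(tool_fn):
    @functools.wraps(tool_fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = tool_fn(*args, **kwargs)
        TRACE.append({
            "tool": tool_fn.__name__,
            "kwargs": kwargs,
            "latency_s": time.perf_counter() - start,
        })
        return result
    return wrapper

@traced
def validate_record(record_id):
    # Stubbed tool; a real one would check the record against a source system.
    return {"record_id": record_id, "valid": True}

out = validate_record(record_id="R-42")
```

Because the decorator sits at the tool boundary, every action the agent takes leaves a record, regardless of which reasoning path produced it.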

Phase 3. Scope and sandbox

The first production class of agents should be deployed in a tightly scoped use case that's considered low risk, because the agent performs read operations rather than write operations and the success criteria are well defined. The following are examples of this type of use case:

  • Triaging customer support queries.
  • Routing staff-generated documents.
  • Data validation against known data sets.

Agents should run in "shadow" mode, where they execute the workflow but their decisions aren't committed, so an incorrect decision has no real-world effect. They should stay in this mode until they reach the required accuracy threshold on a sufficiently large sample of data.
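Shadow-mode evaluation reduces to scoring the agent's decisions against what humans actually did. The routing rule, ticket data and threshold below are all made up for illustration; the pattern, not the numbers, is the point.

```python
# Shadow-mode sketch: the agent's routing decisions are recorded and scored
# against the human's actual routing, but never committed. Promotion to live
# traffic happens only after accuracy clears the threshold.

def agent_route(ticket):
    # Toy routing rule standing in for the agent's decision.
    return "billing" if "invoice" in ticket["text"] else "support"

def shadow_evaluate(tickets, threshold=0.9):
    correct = sum(agent_route(t) == t["human_route"] for t in tickets)
    accuracy = correct / len(tickets)
    return {"accuracy": accuracy, "promote": accuracy >= threshold}

sample = [
    {"text": "invoice overdue",      "human_route": "billing"},
    {"text": "app keeps crashing",   "human_route": "support"},
    {"text": "wrong invoice amount", "human_route": "billing"},
    {"text": "reset my password",    "human_route": "billing"},  # agent disagrees here
]
report = shadow_evaluate(sample)
```

In practice the sample must be large enough that the accuracy estimate is statistically meaningful, not a handful of tickets as in this toy run.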

Phase 4. Expand the envelope

Agents are given control of a task incrementally as they prove themselves on it. Write access is granted to one agent at a time, and once each agent is stable, the agents run in parallel. Expand agents to adjacent workflows with the same instrumentation and guardrails applied in the earlier phases.

The enterprise that acts

The organizations that will win the next decade of enterprise performance are taking important steps now. They're building automation platforms, determining the integrations they need, defining the governance rules required to maintain order and designing business processes in which people focus on high-value work while automation supports and augments them through scalable, repeatable workflows.

Our first real foray into creating a new interface to the business was the chatbot. Our first foray into creating a new operating model for work is the agentic enterprise. It's time to stop chatting and start building.

Ankita Devadiga is a team lead and technical product owner with over nine years of experience at Accenture, where she drives technology strategy and product delivery with precision. She holds a master's degree in computer applications from SNDT Women's University, Mumbai, India. With a strong foundation in both technical expertise and leadership, Ankita brings a sharp, informed perspective to everything she writes about.
