How to build AI security guardrails without blocking innovation

To take advantage of opportunities AI might present -- without opening the door to a breach -- an organization needs to put the right guardrails in the right places.

While adoption of AI tools has surged, security has not kept pace.

McKinsey's "State of AI: Global Survey 2025" found that 88% of organizations now use AI in at least one business function. IBM's "Cost of a Data Breach Report 2025," meanwhile, found that 13% of organizations experienced breaches of AI models or applications, and that 97% of those breached lacked proper AI access controls.

For CISOs, the challenge is twofold: build guardrails that protect the organization, and do so without blocking the innovation AI enables. Internal AI tools, such as LLMs, copilots, assistants and autonomous agents, introduce risks that traditional security programs were not designed to handle. Addressing these risks requires governance, technical controls and diligent monitoring.

Establish governance first

Before designing technical controls, establish governance. Appoint a single role accountable for AI oversight across the organization. This person needs both the authority to enforce policy and the mandate to coordinate across security, privacy, legal and business teams.

Build a risk register that tracks both AI benefits and threats. Define AI-specific policies covering acceptable use, data handling and training requirements. Frameworks such as NIST's AI Risk Management Framework and ISO/IEC 42001:2023 provide tested structures for this work. NIST Special Publication 800-221A offers a practical starting point organized around two core functions:

  • Govern -- roles, context, benchmarking, policy and communication.
  • Manage -- risk identification, analysis, prioritization, response and monitoring.

Tie AI governance to enterprise strategy. When AI risks connect to business objectives, leadership pays attention and acts.

Design AI security guardrails

Technical guardrails must address several threat categories specific to internal AI deployments.

  • Data protection. Prevent sensitive data from leaking into AI systems. Classify data before it enters any model or agent. Enforce data loss prevention (DLP) controls on AI interfaces and monitor for personally identifiable information in prompts and outputs.
  • Access and identity. AI agents occupy a space between tools and users, creating an identity gap that traditional IAM models do not cover. Apply zero-trust principles to agent permissions. Grant only the minimum access needed for each task, with time-bounded authorizations that expire automatically. Require human approval for critical operations.
  • Prompt and interaction security. Prompt injection remains a primary attack vector for AI systems. Validate and sanitize all inputs. Separate system prompts from user-provided content. Constrain agent actions through allowlists and deploy anomaly detection to flag unusual command sequences.
  • Monitoring and human oversight. Log all agent actions and authentication attempts. Correlate agent activity across systems using a SIEM. Build escalation paths so anomalous behavior triggers human review before damage spreads.
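The access, allowlist and human-approval controls above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the action names, the Grant class and the issue_grant and authorize functions are all hypothetical, and a real deployment would back them with its IAM platform and an audit log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical action names; each organization would define its own allowlist.
ALLOWED_ACTIONS = {"read_ticket", "summarize_document", "rotate_api_key"}
# Subset of allowed actions that also requires a human sign-off.
CRITICAL_ACTIONS = {"rotate_api_key"}


@dataclass(frozen=True)
class Grant:
    """A least-privilege grant: one agent, one action, one expiry time."""
    agent_id: str
    action: str
    expires_at: datetime


def issue_grant(agent_id: str, action: str, ttl_minutes: int, now: datetime) -> Grant:
    """Issue a time-bounded grant, refusing anything off the allowlist."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"action not on allowlist: {action}")
    return Grant(agent_id, action, now + timedelta(minutes=ttl_minutes))


def authorize(grant: Grant, agent_id: str, action: str, now: datetime,
              human_approved: bool = False) -> str:
    """Return 'allow', 'deny' or 'needs_approval' for a requested action."""
    if action not in ALLOWED_ACTIONS:
        return "deny"  # not on the allowlist at all
    if grant.agent_id != agent_id or grant.action != action:
        return "deny"  # grant does not cover this agent or this action
    if now >= grant.expires_at:
        return "deny"  # grant has expired automatically
    if action in CRITICAL_ACTIONS and not human_approved:
        return "needs_approval"  # escalate to a human reviewer
    return "allow"


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
read_grant = issue_grant("agent-7", "read_ticket", ttl_minutes=15, now=now)
print(authorize(read_grant, "agent-7", "read_ticket", now))
print(authorize(read_grant, "agent-7", "read_ticket", now + timedelta(minutes=16)))
```

Every check defaults to denial, so an agent gains access only when a specific, unexpired grant exists. Each decision would typically also be written to the SIEM so that repeated denials can trigger human review.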

Extend guardrails to SDLC and supply chain

Security guardrails should reach into the software development lifecycle and supply chain. Vet third-party AI models, plugins and integrations before deployment. Incidents involving fully permissioned agents, such as OpenClaw, show how exposed admin interfaces, leaked API keys and missing sandboxing create cascading vulnerabilities across connected instances.

Agents that fetch updates from external sources or accept third-party skills introduce supply chain risk. Apply the same scrutiny used for traditional software dependencies. Test models for adversarial inputs, review agent permissions during code review and include AI-specific threat modeling in the SDLC.
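One way to apply dependency-style scrutiny to agent skills is to pin a cryptographic hash when a component passes review and refuse to load anything that no longer matches. The sketch below assumes a hypothetical in-memory registry of vetted skills. The names sha256_hex, is_vetted and summarizer-skill are illustrative and do not refer to any particular agent framework.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of an artifact as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_vetted(name: str, content: bytes, pinned: dict[str, str]) -> bool:
    """Accept an artifact only if it is registered and its hash is unchanged."""
    expected = pinned.get(name)
    if expected is None:
        return False  # never reviewed, so never loaded
    # Constant-time comparison of the computed and pinned digests.
    return hmac.compare_digest(sha256_hex(content), expected)


# Hypothetical skill body, hashed at the time of security review.
reviewed = b"def run():\n    return 'ok'\n"
pinned_hashes = {"summarizer-skill": sha256_hex(reviewed)}

tampered = reviewed + b"import os\n"
print(is_vetted("summarizer-skill", reviewed, pinned_hashes))
print(is_vetted("summarizer-skill", tampered, pinned_hashes))
```

A hash pin only proves the component has not changed since review; it says nothing about whether the reviewed version was safe, so it complements vetting rather than replacing it.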

Operationalize the guardrails

Guardrails work only if they run continuously. Create incident response plans for AI-specific scenarios: agent compromise, credential-revocation cascades, prompt-injection campaigns and data exfiltration through AI interfaces.

Shadow AI, where employees use unapproved AI tools, deserves special attention. According to IBM's report, shadow AI incidents added roughly $670,000 to the average cost of a breach. Monitoring should detect unauthorized AI usage alongside approved deployments.
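As a rough illustration of that kind of monitoring, the sketch below scans simplified proxy records for traffic to AI services that are not on an approved list. The domain inventories are placeholders under the reserved "example" domain, and the (user, domain) record shape is an assumption; real proxy or DNS logs would need their own parsing.

```python
from collections import Counter

# Hypothetical inventories; placeholder hostnames, not real services.
KNOWN_AI_DOMAINS = {
    "chat.ai-vendor-a.example",
    "api.ai-vendor-b.example",
    "assistant.ai-vendor-c.example",
}
APPROVED_AI_DOMAINS = {"api.ai-vendor-b.example"}


def find_shadow_ai(records):
    """Count requests per (user, domain) to known but unapproved AI services."""
    hits = Counter()
    for user, domain in records:
        domain = domain.lower().rstrip(".")  # normalize case and trailing dot
        if domain in KNOWN_AI_DOMAINS and domain not in APPROVED_AI_DOMAINS:
            hits[(user, domain)] += 1
    return hits


# Each record is (user, requested hostname), as might be parsed from a proxy log.
sample = [
    ("alice", "chat.ai-vendor-a.example"),
    ("alice", "CHAT.ai-vendor-a.example"),
    ("bob", "api.ai-vendor-b.example"),
    ("carol", "intranet.corp.example"),
]
for (user, domain), count in find_shadow_ai(sample).items():
    print(user, domain, count)
```

A report like this is a starting point for a conversation about approved alternatives rather than proof of a policy violation, and it catches only services already in the known-domain inventory.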

Set a regular cadence for AI risk meetings. Review the risk register, evaluate the effectiveness of current controls and adjust as threats evolve.

Compliance adds urgency. The EU AI Act imposes mandatory requirements for high-risk AI systems, and U.S. state and local rules, such as New York City's Local Law 144 and the California Privacy Rights Act, apply to automated decision-making. The organization's guardrails should satisfy these requirements by design, not as an afterthought.

What CISOs should do now

To secure an organization's use of AI, start with these steps:

  • Appoint an AI governance lead with clear authority and accountability.
  • Build a risk register covering both AI benefits and threats.
  • Classify data that AI systems can access and enforce DLP controls.
  • Apply zero-trust identity principles to all AI agents and copilots.
  • Audit third-party AI components for supply-chain risk.
  • Create AI-specific incident response playbooks.
  • Schedule regular AI risk reviews tied to enterprise objectives.

Avoid these pitfalls:

  • Treating AI security as a one-time project rather than an ongoing program.
  • Granting agents broad permissions for the sake of convenience.
  • Ignoring shadow AI until a breach forces the conversation.
  • Delaying governance until regulations compel action.

AI adoption will accelerate. The organizations that secure it now will innovate with confidence.

Matthew Smith is a vCISO and management consultant specializing in cybersecurity risk management and AI.