AI agent security is getting more honest. Governance reduces exposure. Intent detection surfaces drift. Runtime analytics shows the path. But none of those answer the last question: should the already-allowed workflow still act right now?

Runtime trust is the missing layer: the control that decides whether the already-allowed, already-scoped workflow should still act now. Policy, access controls, and detection establish ownership, permissions, and visibility, but none of them automatically decide whether the next action is trustworthy in context, and most stacks stop one sentence before that decision. Sunglasses v0.2.32 ships detection patterns directly targeting this gap, including GLS-CAI-248 (delegation token revocation bypass), GLS-CAI-527 (forged attestation nonce scope-rebind), and GLS-TOP-256 (forged audit log tool output).

Quick answer: why AI agent security still fails after governance

AI agent security still fails after governance because policy, access, and detection do not automatically decide whether the next action is trustworthy in context. Governance answers who owns the workflow and what is broadly allowed. Intent detection helps surface suspicious behavior. Runtime analytics helps show drift and sequence. Runtime trust is the layer that decides whether the already-allowed workflow should still act now.

If your stack stops before that last question, you have better posture, but not necessarily better decisions. Read the Sunglasses manual for a full hardening checklist, or start with how the scanner works if you are new to the tool.

What governance, intent detection, and runtime analytics get right

It is worth being fair here. AI governance is not fake work. Teams should know which agents exist, what data they may touch, what tools they may call, which approvals matter, and who is responsible when the workflow goes wrong. That is how enterprises move from vibes to operating discipline.

Intent detection also solves a real problem. When an agent starts behaving differently, accepts a strange instruction pattern, retries in an unusual cadence, or begins steering toward a weird destination, you want that surfaced early. Runtime analytics matters for the same reason. It helps operators see the shape of activity instead of waiting for a headline-level breach.

Those layers do real work because they reduce uncertainty and shrink blast radius. They make the environment more legible. They make it easier to know which workflow changed and which control failed. They are absolutely part of a serious AI agent security program.

The honest problem is narrower: they still do not finish the action-time decision. A workflow can be well-governed, richly observed, and visibly flagged while the system still lacks a clean rule for whether the next action should happen. That is why teams who invest in governance can still watch a bad decision happen in slow motion. For a deeper look at this dynamic, the "runtime governance is not enough" post covers the structural gap in detail.

Plain-language explainer: what the stack misses at runtime

Imagine a support agent with a clean enterprise setup. It has an approved persona, a scoped tool list, a documented escalation flow, and access only to the data it needs. The platform logs every step. A monitoring layer scores risky patterns. A governance team can explain the workflow on a whiteboard in five minutes.

Now the agent reads a tool result that recommends a temporary fallback queue. A callback tells it to continue on a different internal route. A connector note says urgent cases can use a partner endpoint for faster turnaround. A retry loop starts preferring a path the original workflow never emphasized. None of that has to violate the formal policy. None of it needs to look like an attacker wearing a ski mask.

This is where AI agent security breaks in practice. The workflow stays inside the broad permissions model, but the live meaning of the next step changes. The system can detect that the path is different. It can log the sequence. It can even label the drift as interesting. But someone still has to decide whether the agent should trust that new route enough to act.

That missing judgment layer is why "after governance" is the right frame. The problem is not before policy. It is what happens once policy says the workflow may proceed and runtime conditions start changing underneath it. This is also the core surface covered by our "guardrails are not enough" analysis.
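A rough way to see the gap is to compare the check most stacks already have with the one they are missing. Everything below is an illustrative Python sketch: the function names, the ProposedAction structure, and the approved-tool list are hypothetical, not any platform's real API.

# Most stacks implement the first check and stop there.
# All names and structures here are hypothetical illustrations.
from dataclasses import dataclass

@dataclass
class ProposedAction:
    tool: str           # tool the agent wants to call
    destination: str    # endpoint or route the action targets
    source: str         # where the instruction came from (tool output, callback, operator)

APPROVED_TOOLS = {"search_tickets", "create_ticket"}

def is_allowed(action: ProposedAction) -> bool:
    # Governance question: is this tool on the approved list?
    return action.tool in APPROVED_TOOLS

def should_act_now(action: ProposedAction, baseline_destinations: set[str]) -> bool:
    # Runtime-trust question: given where this instruction came from
    # and where it points, should the allowed action still happen?
    if action.source == "tool_output" and action.destination not in baseline_destinations:
        return False  # allowed, but not trusted in this context
    return True

The first function exists, in some form, in most enterprise stacks. The second usually does not, and its absence is exactly the judgment layer this section describes.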

Why detection is not the decision

Detection is necessary because without it, the team is blind. But detection is not the same thing as decision. A dashboard can show suspicious retry cadence. An intent model can tag a tool output as risky. A behavior graph can show that the workflow drifted toward a new endpoint. Useful. Still incomplete.

Operators do not win just because they noticed the problem one step earlier. They win when the system has a defensible rule for what to do next. Should the callback be followed? Should the MCP action be paused? Should the destination change require approval? Should the agent treat the tool output as descriptive data or as authority-bearing guidance? Those are runtime trust questions. The CVP program runs controlled adversarial tests specifically against this action-time decision layer.

This is also why public vendor language often creates a gap Sunglasses can exploit honestly. Broad platforms talk about posture, visibility, policy, and lifecycle because those are real enterprise categories. Sunglasses does not need to out-platform them. It needs to finish the sentence they leave incomplete: after detection, what still decides whether the workflow should act?

A practical way to say it is simple: detection tells you something changed; runtime trust decides whether the changed workflow still deserves action authority.
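One way to picture that difference is a gate that consumes detection signals instead of just recording them. This is a minimal hypothetical sketch, not a Sunglasses API: the risk score, the signal flags, and the thresholds are all assumptions.

# Hypothetical sketch: converting detection signals into a decision.
from enum import Enum

class Verdict(Enum):
    ACT = "act"
    HOLD = "hold"    # pause pending human review
    DENY = "deny"

def trust_gate(risk_score: float, destination_changed: bool, authority_claim: bool) -> Verdict:
    # Detection produced these signals; the gate turns them into a rule.
    if authority_claim and destination_changed:
        return Verdict.DENY    # descriptive text is trying to act as authority
    if risk_score >= 0.5 or destination_changed:
        return Verdict.HOLD    # flagged drift pauses the action instead of annotating it
    return Verdict.ACT

The thresholds are arbitrary. The point is that the output is a verdict the workflow must obey before acting, not a severity label on a dashboard.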

Three concrete attack examples

1) Intent is flagged, but the workflow still follows the callback

A support agent receives a tool response that includes a callback URL and a note that this path is now preferred for urgent requests. The observability layer notices the callback is unusual. The intent system marks the response as medium risk. But the workflow still follows it because nothing in the runtime path says "flagged" should translate into "do not act yet." The system saw the drift. It just did not convert that signal into a decision. This is the exact surface GLS-CAI-248 (delegation token revocation bypass) targets: the agent is explicitly told to proceed past a revoked credential signal.
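In code terms, the missing piece in this incident is a single branch. A hedged sketch with hypothetical names: the intent system produces a risk label, and the workflow consults it before following the callback.

# Hypothetical sketch: "flagged" must translate into "do not act yet".
def follow_callback(callback_url: str, risk_label: str, pending_review: list[str]) -> bool:
    if risk_label in ("medium", "high"):
        # The signal existed in the incident above; this branch did not.
        pending_review.append(callback_url)
        return False  # hold the action instead of following the flagged URL
    return True  # low-risk callbacks proceed as before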

2) Governance is correct, but an MCP handoff quietly changes authority

An agent is allowed to use one approved MCP server for retrieval and one for ticket creation. During a normal sequence, a tool output nudges the workflow toward a different follow-up action that remains technically inside the approved category of work. Authentication is still valid. The tools are still on the list. Yet the handoff now points the workflow toward a more sensitive action path than the operator expected. This is where MCP security and runtime trust meet: protocol hygiene matters, but so does evaluating whether the next allowed step should still be trusted. Pattern GLS-CAI-527 targets this exact dynamic — forged attestation nonce scope-rebind in cross-agent handoffs. The MCP tool poisoning post covers the underlying protocol surface in depth.
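A sketch of the check that would close this gap: compare the handoff's effective scope against what the operator expected for this step, not just against the approved tool list. The scope labels, tool names, and function below are hypothetical.

# Hypothetical sketch: an allowed tool can still be an unexpected escalation.
APPROVED_TOOLS = {"retrieve_docs": "read", "create_ticket": "write"}

def handoff_in_expected_scope(next_tool: str, expected_scope: str) -> bool:
    tool_scope = APPROVED_TOOLS.get(next_tool)
    if tool_scope is None:
        return False  # not approved at all
    # The governance check stops at the line above. The runtime-trust
    # check also asks whether this step was supposed to write anything.
    return tool_scope == expected_scope

# A retrieval step that suddenly hands off to a write-capable tool
# passes governance but fails the expected-scope check:
print(handoff_in_expected_scope("create_ticket", expected_scope="read"))  # False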

3) Runtime analytics sees destination drift, but no one owns the stop/go call

A coding or operations agent starts retrying outbound requests toward a new endpoint. The analytics layer shows the pattern clearly. The governance team can later explain which system approved the workflow. But in the live moment, the agent still keeps going because no control translates "destination drift detected" into "hold this action pending review." The environment is observable. The decision is still missing. Pattern GLS-TOP-256 covers the tool-output-layer version of this — forged audit log entries that manufacture a safe-to-proceed verdict.
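The stop/go call in this example can be a small, ownable rule: hold requests to endpoints outside the workflow's baseline until someone reviews them. A minimal sketch with hypothetical names and endpoints.

# Hypothetical sketch: translate "destination drift detected" into a hold.
BASELINE_ENDPOINTS = {"https://api.internal.example/deploy"}

def gate_outbound(url: str, held: list[str]) -> bool:
    if url not in BASELINE_ENDPOINTS:
        held.append(url)   # park the request for review instead of retrying it
        return False       # the agent stops here, in the live moment
    return True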

How Sunglasses catches it

Sunglasses fits as a provider-agnostic runtime-trust layer. It is not pretending to be the whole governance platform, the whole AI-SPM layer, or the whole analytics stack. It is useful at the narrower point where trust-bearing text and workflow guidance start reshaping what an already-allowed agent believes it should do.

That includes prompts, tool descriptions, YAML, runbooks, callback instructions, connector notes, policy fragments, MCP-adjacent metadata, and ordinary-looking operational text that can quietly widen authority. Those surfaces matter because they often decide how the workflow interprets the next action long before a human notices the pattern in a dashboard.

This is why Sunglasses is especially useful after governance and detection are already in place. Once the broad control stack exists, teams need help inspecting the language and metadata that can turn a technically allowed workflow into an unsafe live action. The practical starting point stays simple:

pip install sunglasses
sunglasses scan <path>

From there, review anything that widens scope, changes destinations, reframes policy, normalizes a fallback path, softens a guardrail, or turns descriptive output into executable trust. In other words: detect the drift, then inspect the words and metadata that try to turn drift into action. The FAQ has common questions about how the scanner integrates into existing pipelines.
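If you want that scan in a pipeline rather than run by hand, a thin wrapper over the documented command is enough. This sketch assumes only the sunglasses scan <path> invocation shown above, plus the conventional assumption that a nonzero exit code signals findings; check the manual and FAQ for the tool's actual integration options.

# Hedged sketch: run the documented scan command over trust-bearing text
# surfaces (prompts, runbooks, YAML, connector notes) in CI.
import subprocess
import sys

SURFACES = ["prompts/", "runbooks/", "config/"]  # hypothetical repo layout

def main() -> int:
    worst = 0
    for path in SURFACES:
        result = subprocess.run(["sunglasses", "scan", path])
        worst = max(worst, result.returncode)
    return worst  # assumes a nonzero exit code signals findings; verify in the manual

if __name__ == "__main__":
    sys.exit(main())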