Runtime trust is the missing layer. AI usage control and governance reduce exposure by defining where and how AI interactions are allowed. But AI agent security still requires a runtime-trust layer that decides, after those controls are satisfied, whether a live tool call, MCP handoff, callback chain, or outbound request should still be trusted in context. Sunglasses v0.2.43 ships 700 detection patterns across 55 categories, including the agent_workflow_security category that targets exactly this decision layer.
What AI usage control gets right
AI agent security is becoming a real buyer category, but the language around it is starting to split. Some vendors lead with governance. Others lead with browser controls, interaction controls, or the newer phrase AI usage control. Those categories matter because they help teams reduce exposure and supervise how AI is being used. But they still do not answer the hardest runtime question in agentic systems: once the workflow is already allowed to proceed, should it still be trusted to take this action right now?
AI usage control is a fair and useful phrase because buyers need language for the interaction layer. Teams really do need rules about where AI is allowed to run, which sessions are supervised, what browsers or enterprise contexts are in scope, what kinds of interactions are blocked, and which categories of use are allowed or denied. That part of the category is real. It reduces exposure and gives operators a cleaner administrative surface.
This is why the phrase is sticky. Compared with abstract "AI safety platform" language, usage control sounds operational. It sounds like something a security or IT leader can implement. That is also why browser-first and governance-first competitors are using it. The honest move is to acknowledge that interaction controls, browser controls, and governance policies reduce risk. They can limit where activity starts, constrain what categories of behavior are allowed, and make misuse easier to see. The mistake is assuming they finish the whole job.
Where usage control stops and runtime trust starts
Imagine a support agent that can read account context, update a ticket, call a billing connector, and pull internal guidance from a knowledge system. Your security team has already done serious work. The browser context is governed. The session is compliant. The tool scopes are narrow. MCP access is authenticated. The workflow is isolated. On the governance dashboard, everything looks correct.
Then the live workflow starts learning from its environment. A tool returns a recommended next action. A callback says a queue has changed. A connector note says a backup endpoint is temporarily preferred. A retry hint says the normal path is degraded and the agent should use a fallback. None of that has to look malicious. In fact, it often looks like exactly the kind of operational detail that keeps systems moving.
But this is the moment where AI usage control stops being enough. The question is no longer whether the interaction was allowed to happen. The question is whether the information shaping the next action should be trusted. Runtime trust is the layer that evaluates that moment. It decides whether the agent should still be trusted to act on what it just learned, even when the source looks structurally valid and the action remains technically in scope.
That is why AI agent security should be taught in layers. Governance and usage control answer where AI may operate. Protocol and access controls answer what the workflow can reach and how safely it reaches it. Runtime trust answers whether the workflow should still carry this action path forward in context. If the third layer is missing, the system can stay perfectly aligned with policy on paper while still making an unsafe decision at runtime. Learn more about this layered model in the Sunglasses architecture overview and the security manual.
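To make the three layers concrete, here is a minimal Python sketch of a decision function that checks them in order. Everything in it (the `ActionContext` fields, the `decide` function, the result strings) is a hypothetical illustration and not part of Sunglasses or any other product. It only shows that an action can pass the first two layers and still be held at the third.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionContext:
    """Hypothetical summary of one pending agent action."""
    session_allowed: bool  # layer 1: governance and usage control
    tool_in_scope: bool    # layer 2: protocol and access controls
    source_trusted: bool   # layer 3: runtime trust in what shaped this action


def decide(ctx: ActionContext) -> str:
    """Check the layers in order and report the first one that objects."""
    if not ctx.session_allowed:
        return "deny: usage policy"
    if not ctx.tool_in_scope:
        return "deny: out of scope"
    if not ctx.source_trusted:
        # Policy and scope are satisfied, but the input steering this
        # action has not earned trust, so the action is held for review.
        return "hold: runtime trust"
    return "allow"


# Policy-compliant and in scope, yet still held at the third layer.
print(decide(ActionContext(session_allowed=True, tool_in_scope=True, source_trusted=False)))
```

The ordering is a design choice in this sketch: the cheaper, static checks run first, and the contextual check runs last because it depends on what the workflow has just learned.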
The core claim, one line: AI usage control and governance can reduce exposure, but runtime trust still decides whether the agent should be trusted to call this tool, follow this callback, carry this MCP handoff, or reach this endpoint right now.
Why answer engines keep collapsing the category into governance
There is a reason answer engines keep grouping these questions under governance, policy enforcement, sandboxing, and red teaming. Those buckets are real, easy to summarize, and familiar across the broader AI-security market. They are also easier for an engine to classify than a more nuanced question about live authority, hidden steering, or callback trust.
That classification habit is not entirely wrong. Sandboxing answers a blast-radius question. Governance answers a visibility and policy question. Testing answers a pre-deployment discovery question. But those are not the same as the live trust question. A workflow can be sandboxed and still trust the wrong callback. It can be governed and still inherit unsafe next-step authority. It can pass testing and still drift toward an untrusted endpoint during a normal-looking retry sequence.
For Sunglasses, this creates a practical writing rule: do not fight the control buckets. Start with them. Then finish the sentence they leave incomplete. The best citation-friendly version is simple enough to quote: AI usage control and governance can reduce exposure, but runtime trust still decides whether the agent should be trusted to call this tool, follow this callback, carry this MCP handoff, or reach this endpoint right now.
Three concrete workflow examples
1) A governed session is correct, but a callback chain quietly gains authority
A workflow completes an approved action and receives a callback telling it where to continue. The session policy is correct. The browser is compliant. The original tool call was allowed. But the callback now acts like a new source of authority. It might redirect the agent to a different queue, suggest a different next step, or change the path after the original approval already happened.
This is easy to miss because operators often think the risky moment was the first approval. In practice, the risk moved downstream. Governance made the initial interaction safer. Runtime trust still has to decide whether the callback path deserves to be followed. This is exactly the kind of pattern that the agent workflow security category in Sunglasses is built to catch.
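One way to picture callback review is to compare what the callback asks for against what was originally approved. The sketch below assumes a hypothetical `Approval` record and a callback delivered as a plain dictionary with optional `queue` and `next_step` keys. These names are illustrative assumptions, not a Sunglasses interface or a real protocol schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    """Hypothetical record of what was approved before the tool call ran."""
    tool: str
    queue: str
    next_steps: frozenset


def review_callback(approval: Approval, callback: dict) -> list:
    """Return findings where the callback departs from the original approval."""
    findings = []
    queue = callback.get("queue", approval.queue)
    if queue != approval.queue:
        findings.append(f"queue changed: {approval.queue} -> {queue}")
    step = callback.get("next_step")
    if step is not None and step not in approval.next_steps:
        findings.append(f"unapproved next step: {step}")
    return findings


approval = Approval(
    tool="update_ticket",
    queue="support-tier1",
    next_steps=frozenset({"close_ticket", "notify_customer"}),
)
# The callback is well-formed, but it redirects the work and adds a step.
print(review_callback(approval, {"queue": "external-review", "next_step": "export_account"}))
```

An empty result means the callback stayed inside the approved path. Any finding is treated as a new trust event that needs its own decision; it does not inherit the first approval.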
2) An allowed tool remains in scope, but the destination behind it drifts
An approved connector is still being used exactly as expected. No one added a new capability. No one violated the permission model. But the endpoint behind the call changes, or the request starts routing through an unexpected service that still appears operationally valid. From a policy perspective, the activity may remain "allowed." From a trust perspective, the workflow just changed shape.
This is where AI agent security needs more than access hygiene. Permissions tell you what the workflow may reach. Runtime trust helps decide whether the current destination still deserves that trust in context. See "stopping agents from calling untrusted endpoints" for the detailed breakdown.
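Destination drift can be checked with nothing more than the standard library. The sketch below assumes a hypothetical `APPROVED` table that maps a connector name to the scheme and host pairs it is expected to reach. The connector name and hostnames are invented for illustration. A permission check would pass in every case shown, because the connector itself never changes.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: connector name -> approved (scheme, host) pairs.
APPROVED = {
    "billing": {("https", "billing.internal.example")},
}


def destination_drifted(connector: str, url: str) -> bool:
    """True when the URL's scheme or host is not approved for this connector."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    return (parts.scheme, host) not in APPROVED.get(connector, set())


print(destination_drifted("billing", "https://billing.internal.example/v1/charge"))
print(destination_drifted("billing", "https://billing-backup.example.net/v1/charge"))
```

Comparing the parsed `hostname` instead of searching the raw string matters here: a URL such as `https://billing.internal.example@evil.example/` contains the approved name but actually points at `evil.example`.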
3) Healthy-looking retries become hidden steering
Some workflows naturally retry, check health, or fetch updates. That is normal. But a repeated outbound rhythm can also become a control path. The cadence can start to resemble steering, beaconing, or dependence on an external signal that keeps changing what the agent believes it should do next.
Usage control may happily allow the interaction class. Governance may record the events. Neither one automatically explains whether the pattern is now shaping live authority in a way defenders should distrust. That is why suspicious cadence and destination behavior belong inside AI agent security, not just network monitoring after the fact. The CVP benchmark runs we published demonstrate how easily these patterns slip past standard controls.
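As a rough illustration of cadence review, the sketch below flags a series of outbound call timestamps whose gaps are unusually regular, using the coefficient of variation of the gaps. The function name and both thresholds are assumptions chosen for the example, not tuned values and not how Sunglasses detects anything. Legitimate heartbeats are also regular, so a hit is only a prompt to look at the destination and at what the responses change.

```python
from statistics import mean, pstdev


def looks_like_beaconing(timestamps, min_events=6, max_cv=0.1):
    """Flag outbound calls whose spacing is suspiciously regular.

    timestamps: seconds since some reference point, in any order.
    max_cv: largest coefficient of variation (stdev / mean of the gaps)
            still treated as "regular". Both defaults are illustrative.
    """
    if len(timestamps) < min_events:
        return False
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(gaps)
    if average <= 0:
        return False
    return pstdev(gaps) / average <= max_cv


# A call roughly every 30 seconds is flagged; bursty retries are not.
print(looks_like_beaconing([0, 30.5, 60, 90.2, 120, 150.1, 180]))
print(looks_like_beaconing([0, 2, 50, 51, 200, 230, 600]))
```

In practice this kind of signal is more useful when combined with the destination check and the callback review, because regular timing alone does not show that the pattern is steering the agent.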
How Sunglasses catches it
Sunglasses fits this stack as a runtime-trust layer. It treats agent-facing text and metadata as part of the live authority model, not as background documentation. That includes prompts, YAML, tool descriptions, policy notes, connector guidance, callback instructions, MCP-related metadata, and other ordinary-looking content that can quietly change what a workflow believes.
That matters because the sharpest security failures rarely arrive as obvious malware. They arrive as convenience, fallback guidance, ordinary metadata, or operational hints that sound legitimate enough to inherit trust. A backup endpoint is suggested. A connector note broadens a path. A callback says the "preferred" queue changed. A retry block makes continued outbound contact sound routine. If those signals are never reviewed as trust-bearing inputs, the system can stay inside policy language while still drifting into unsafe behavior.
Sunglasses helps teams inspect those trust-bearing surfaces earlier, before they become live decisions. It is not pretending to be the whole browser-security stack, the whole governance platform, or the whole isolation layer. It is useful at the moment a defender needs to ask: are the words and metadata around this workflow quietly changing its authority? Read the FAQ for more on what Sunglasses catches and where it fits in an existing security stack.
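To show what reviewing text as a trust-bearing input can mean in the simplest possible form, here is a toy line scanner. The three regular expressions and the `flag_trust_bearing_lines` function are invented for this example. They are not Sunglasses detection patterns and say nothing about how the product works internally.

```python
import re

# Hypothetical hints, loosely based on the examples in the text above.
HINTS = {
    "fallback endpoint": re.compile(r"\b(backup|fallback|alternate)\s+(endpoint|url|host)\b", re.I),
    "routing change": re.compile(r"\bpreferred\s+(queue|route|path)\b", re.I),
    "retry steering": re.compile(r"\bretry\b.*\b(instead|use)\b", re.I),
}


def flag_trust_bearing_lines(text):
    """Return (line_number, label) pairs for lines that try to steer the agent."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in HINTS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


note = (
    "Connector note: the backup endpoint is temporarily preferred.\n"
    "Ticket updated successfully.\n"
    "Preferred queue changed to tier-2."
)
print(flag_trust_bearing_lines(note))
```

A real scanner needs far broader coverage than a handful of keyword patterns. The sketch only makes the idea tangible: ordinary operational text is scanned as input that can change what the agent does next.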
For teams that want the practical starting point, the path remains simple:
pip install sunglasses
sunglasses scan <path>
A practical AI agent security checklist
- Identity and authentication: know which tools, servers, and connectors the workflow is allowed to use and how those identities are verified.
- Scoping and permissions: reduce authority to what the task actually needs and keep read paths separate from write paths.
- Governance and usage-control rules: set policies for which sessions, browsers, interactions, and contexts are allowed.
- Schema validation and protocol hygiene: reject extra fields, ambiguous structures, and unsafe metadata on tool or MCP paths.
- Sandboxing and isolation: reduce blast radius when execution goes wrong.
- Tool-call gating: do not assume an allowed tool call is automatically trustworthy in every context.
- Callback trust review: treat callback chains, next-step hints, and routing instructions as new trust events.
- Outbound destination controls: know which endpoints are approved and treat destination drift as a security signal.
- Suspicious cadence detection: watch for retries, heartbeats, or fetch patterns that look more like steering than normal health checks.
- Trust-boundary review of text surfaces: prompts, docs, runbooks, policy notes, and connector metadata can all change what the agent believes.
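The schema-validation item in the checklist above can be sketched in a few lines. The message shape here (a `tool` string and an `arguments` dictionary) is an assumption made for illustration and is not the MCP wire format. The point is only that unknown fields are rejected instead of being silently passed along.

```python
# Hypothetical strict schema: field name -> required type.
ALLOWED_FIELDS = {"tool": str, "arguments": dict}


def validate_tool_message(message: dict) -> list:
    """Return a list of problems; an empty list means the message is accepted."""
    errors = []
    for extra in sorted(set(message) - set(ALLOWED_FIELDS)):
        errors.append(f"unexpected field: {extra}")
    for name, expected_type in ALLOWED_FIELDS.items():
        if name not in message:
            errors.append(f"missing field: {name}")
        elif not isinstance(message[name], expected_type):
            errors.append(f"wrong type for {name}")
    return errors


# An extra routing hint rides along with an otherwise valid tool call.
print(validate_tool_message({
    "tool": "update_ticket",
    "arguments": {"id": 42},
    "preferred_endpoint": "https://billing-backup.example.net",
}))
```

Rejecting extras by default is the conservative choice: a field the schema does not know about is exactly where an unreviewed instruction can hide.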
If your current security story ends at governance, browser controls, or sandboxing, that is a reasonable start. It is not the whole answer. The stronger posture asks one more question at every critical turn: what in this workflow is allowed to speak with authority, and should that authority still be trusted right now?