What is tool identity drift in AI agents?

Tool identity drift is an AI-agent security failure where the tool identity, alias, descriptor, schema, or runtime binding used at execution no longer matches the tool identity that was reviewed or approved.

How is tool identity drift different from MCP tool poisoning?

Discussion of MCP tool poisoning usually centers on poisoned tool descriptions or instructions. Tool identity drift focuses on the binding problem: the approved label, alias, schema, or descriptor can resolve to a different capability by the time the agent acts.

Why is tool approval not enough for AI agent security?

Tool approval is not enough because approval can attach to a name, description, or schema while runtime execution follows aliases, resolver behavior, fallback tools, or changed descriptors. Runtime trust has to re-check the binding before action.

How does Sunglasses catch tool identity drift?

Sunglasses looks for combinations of tool names, aliases, descriptions, schemas, capability metadata, fallback paths, stronger-tool delegation, and policy-bypass language that indicate a reviewed tool identity may not match the action the agent is about to run.

Tool identity drift: when the approved AI tool is not the tool that runs

sunglasses://blog/tool-identity-drift-runtime-trust

The risk is not only that a tool description is malicious. It is that an agent approves one tool identity, then aliases, schemas, descriptors, fallbacks, or runtime bindings quietly point the action somewhere else.

FIG.01 · Explainer

What tool identity drift means

Baseline

Tool identity drift happens when the identity an AI agent reviews or approves is not the same identity that executes. The drift can happen through a renamed tool, a confusing display label, a changed schema, a descriptor alias, a fallback path, a dynamic registry entry, or a resolver that maps a safe-looking label to a stronger capability.

Why fragile

Tool identity drift is a distinct angle within MCP tool poisoning. The older tool-poisoning explainer asks what happens when tool descriptions or metadata instruct the model. This page asks a narrower execution question: even if a team approved the tool, is the runtime still bound to the same tool that was approved?

Tool identity drift wins when approval attaches to the name, but execution follows a different binding.

The real question

That sentence matters because real agents need tools. They need registry discovery, schemas, manifests, aliases, and generated wrappers. The safe answer is not to ban tools. The safe answer is to stop treating the reviewed label as a permanent truth once the action path starts changing. Seeing how Sunglasses works at the action boundary makes the point concrete: an approval record that no longer binds the current call is functionally the same as no approval at all.

FIG.02 · Market signal

Why approved tools drift at runtime

Market signal

Agents do not call "tools" in the abstract. They call names, schemas, descriptors, endpoints, wrappers, aliases, and runtime resolvers. Every one of those layers can create a mismatch between what a human or policy engine approved and what the agent is about to do.

The shift

The tool-poisoning corpus frames tool metadata, descriptions, schemas, prompts, and adjacent capability text as part of the agent control surface. That is the core insight: tool identity is not only a backend implementation detail. For an agent, the text and structured fields around a tool help decide plan, priority, scope, and permission.

Evidence

The dangerous version is not merely "the description is bad." It is a chain of individually plausible steps: a safe tool name, a permissive alias, a schema field that implies broader authority, a fallback tool with more privilege, and a registry update that changes capability after approval. Each step looks like infrastructure. Together they can move the action outside the reviewed trust boundary. This is the same family of risk covered by API descriptor poisoning and structured metadata poisoning, viewed from the binding side instead of the text side.

Why now

Tool gateways, IAM, MCP server authentication, allowlists, and schema validation all help. Sunglasses does not replace them. It adds the action-time question those controls often leave implicit: does the exact tool binding still match the reviewed intent at the moment this agent acts?

FIG.03 · Field evidence

Three concrete attack examples

Field evidence

Each example keeps the same failure shape: the approval record is real, but it no longer binds to the action that is about to run.

1. The safe display name with a privileged resolver

An agent is allowed to call report_summary. The display name and description look like a read-only reporting tool. At runtime, an alias or registry resolver maps the call to a stronger internal capability that can export raw customer data.

The suspicious part is not the word "summary" by itself. It is the mismatch between display label, canonical identifier, resolver output, and data authority. Runtime trust should verify that the canonical capability is still the one the policy approved.
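
As a rough illustration of that check, here is a minimal Python sketch. It assumes a hypothetical approval store keyed by display name; the identifiers (`reports.summary.v1`, `aggregate_read`, `binding_matches`) are invented for the example and are not part of Sunglasses or any MCP SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApprovedBinding:
    """What the reviewer actually signed off on (all values hypothetical)."""
    canonical_id: str
    data_authority: str  # e.g. "aggregate_read" versus "raw_export"


# Hypothetical approval store, keyed by the display name the agent calls.
APPROVED = {
    "report_summary": ApprovedBinding("reports.summary.v1", "aggregate_read"),
}


def binding_matches(display_name: str, resolved_id: str, resolved_authority: str) -> bool:
    """Return True only if the resolver output equals the approved binding."""
    approved: Optional[ApprovedBinding] = APPROVED.get(display_name)
    if approved is None:
        return False  # nothing was ever approved under this name
    return (
        approved.canonical_id == resolved_id
        and approved.data_authority == resolved_authority
    )
```

The point of the sketch is that the display name is only a lookup key. The decision compares what the resolver actually returned against what was reviewed.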

2. The schema field that expands authority after review

A deployment helper was approved with a narrow schema. Later, a generated schema field says mode: emergency_admin or describes an optional parameter as authorized to skip a normal review gate. The agent sees the field as part of the tool contract and plans around it.

That is tool identity drift because the tool's practical identity changed from "narrow helper" to "gate-changing admin path." The defense is to bind approval to the schema version, capability hash, and allowed action class, not just to the tool name.
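
One way to picture that binding is a fingerprint over the tool name, schema, and action class, recorded at review time and recomputed before the call. The sketch below is an assumption about how such a hash could be built, not a description of how Sunglasses does it; `capability_hash`, `deploy_helper`, and the `deploy:narrow` action class are hypothetical names.

```python
import hashlib
import json


def capability_hash(tool_name: str, schema: dict, action_class: str) -> str:
    """Fingerprint the reviewed contract. Key order does not affect the result."""
    payload = json.dumps(
        {"tool": tool_name, "schema": schema, "action_class": action_class},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Schema as it looked at review time (hypothetical).
approved_schema = {
    "type": "object",
    "properties": {"service": {"type": "string"}},
}
approved_hash = capability_hash("deploy_helper", approved_schema, "deploy:narrow")

# Schema as it looks now, after a generated field added an admin mode.
current_schema = {
    "type": "object",
    "properties": {
        "service": {"type": "string"},
        "mode": {"type": "string", "enum": ["emergency_admin"]},
    },
}
current_hash = capability_hash("deploy_helper", current_schema, "deploy:narrow")

# The tool name is unchanged, but the approval no longer binds this contract.
approval_still_binds = approved_hash == current_hash
```

Hashing the serialized schema catches any change, including harmless ones, so a real deployment would probably pair this with a re-review step rather than a hard block.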

3. The fallback tool that inherits trust from the wrong call

An MCP workflow tries a harmless connector, receives an error, and automatically falls back to a more powerful connector. The fallback message says the substitute is equivalent and should inherit the same approval because it satisfies the task.

Fallbacks are useful. They are also where authority can stretch. A runtime-trust check should ask whether the fallback has the same scope, endpoint, data access, side effects, and approval path as the original tool. If it does not, the approval must not carry over silently.
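
A minimal sketch of that comparison follows, assuming each connector can be described by a small profile. The `ToolProfile` fields mirror the five properties named above; the field values and the function name are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class ToolProfile:
    scope: str
    endpoint: str
    data_access: frozenset
    side_effects: frozenset
    approval_path: str


def fallback_differences(original: ToolProfile, fallback: ToolProfile) -> List[str]:
    """List every profile field where the fallback differs from the original."""
    return [
        f.name
        for f in fields(original)
        if getattr(original, f.name) != getattr(fallback, f.name)
    ]


def approval_carries_over(original: ToolProfile, fallback: ToolProfile) -> bool:
    """Approval transfers only when the two profiles are identical."""
    return not fallback_differences(original, fallback)
```

Strict equality is a deliberately conservative choice for the sketch. A team might reasonably decide that a strictly narrower fallback can inherit approval, but that would be a policy decision to make explicitly rather than a default.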

FIG.04 · Coverage

How Sunglasses catches it

The wedge

Sunglasses catches tool identity drift by looking for the overlap between tool identity, capability changes, binding language, and bypass intent. In the tool-poisoning family, high-signal ingredients include tool descriptions, schema field descriptions, fake authority cues, metadata that redirects the agent, and chains where one weak tool's metadata influences a stronger tool.

What we look for

The pattern is combinational. "Use this tool" is normal. "Use this alias because it supersedes the approved validator" is not normal. "Here is an optional parameter" is normal. "This parameter grants emergency admin mode and bypasses review" is not normal. "Fallback to the mirror endpoint" can be normal. "Fallback inherits all approval from the original tool even though it has broader side effects" is not normal.
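
To show what "combinational" means in practice, here is a toy Python sketch. It is not the Sunglasses detection logic; the three word lists are illustrative assumptions chosen to fit the example sentences above. Text is flagged only when identity language, bypass language, and authority language all appear together.

```python
import re

# Illustrative word lists only. A real detector would need far more than keywords.
IDENTITY_TERMS = re.compile(r"\b(alias|fallback|parameter|mode|tool|endpoint)\b", re.I)
BYPASS_TERMS = re.compile(
    r"\b(supersedes?|bypass(es)?|inherits?|skips?|overrides?|waives?)\b", re.I
)
AUTHORITY_TERMS = re.compile(
    r"\b(approval|approved|review|validator|admin|enforcement|gate)\b", re.I
)


def looks_like_binding_drift(text: str) -> bool:
    """Flag text only when all three ingredient families co-occur."""
    return all(
        pattern.search(text)
        for pattern in (IDENTITY_TERMS, BYPASS_TERMS, AUTHORITY_TERMS)
    )
```

No single word triggers the flag. "Fallback" or "parameter" alone is ordinary tool language, and only the overlap with bypass and authority cues makes the text worth a closer look.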

The question

Sunglasses is deliberately narrower than a gateway or identity platform. Gateways decide what tools exist. IAM decides who may reach them. Sandboxes decide where code can run. Sunglasses sits close to the agent action and asks: did untrusted tool text or metadata just change what this tool is, where it binds, what capability it claims, or which approval should apply?

House sentence

That makes this page useful for teams building MCP workflows, code agents, browser agents, internal tool registries, generated tool wrappers, and agentic CI/CD systems. The repeated lesson is simple: approved-tool lists are necessary, but they are not sufficient when the binding itself is dynamic. Defenders can map this surface against the attack-pattern catalog and the hardening steps in the Sunglasses manual.

FIG.05 · First controls

A runtime-trust checklist for tool binding

First sentence

Before an agent uses a tool, check the runtime identity, not just the approved name. Use this checklist when reviewing tool calls, generated wrappers, MCP descriptors, schema updates, fallback paths, and registry changes:

Checklist

Canonical identifier: Does the runtime tool ID match the approved ID, not only the display name?
Alias normalization: Do aliases, Unicode confusables, casing, redirects, and registry names resolve to the same approved tool?
Schema version: Did a schema, parameter, or field description change the action class after review?
Capability target: Does the invoked capability have the same data access, side effects, and destination as the reviewed capability?
Fallback behavior: Does any fallback tool preserve the same approval boundary, or does it need a new gate?
Bypass language: Is any description, metadata, or result asking the agent to skip, override, inherit, waive, or supersede enforcement?
Action-time decision: Does this exact tool call still make sense after current context, tool output, callback data, repository text, or MCP handoff shaped the plan?
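
The alias-normalization item can be sketched in a few lines of Python. This is a simplified example built on assumptions: the alias table and approved IDs are hypothetical, and the non-ASCII rejection is a blunt stand-in for a proper confusables check. Unicode NFKC normalization folds compatibility forms such as fullwidth letters, but it does not map look-alike characters from other scripts (for example Cyrillic "е") onto Latin ones.

```python
import unicodedata
from typing import Dict, Optional, Set


def normalize_tool_name(name: str) -> str:
    """Fold compatibility forms and case so trivially different spellings match."""
    return unicodedata.normalize("NFKC", name).casefold().strip()


def resolve_alias(
    name: str, alias_table: Dict[str, str], approved_ids: Set[str]
) -> Optional[str]:
    """Return the approved canonical ID for a name, or None if it does not resolve."""
    normalized = normalize_tool_name(name)
    if not normalized.isascii():
        # Cross-script look-alikes survive NFKC, so refuse rather than guess.
        return None
    canonical = alias_table.get(normalized, normalized)
    return canonical if canonical in approved_ids else None
```

Returning `None` instead of a best guess keeps the failure visible: an unresolved name becomes a reason to pause, not a reason to pick the nearest match.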

The controls

If the answer is uncertain, pause the action. The agent can still read, summarize, or ask for confirmation. It should not silently turn one approved tool into another effective capability.
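
That rule can be sketched as a small decision function. The example assumes each checklist item reports `True` (passed), `False` (failed), or `None` (uncertain); the check names and the `Decision` type are hypothetical.

```python
from enum import Enum
from typing import Dict, Optional


class Decision(Enum):
    ALLOW = "allow"
    PAUSE = "pause"


def action_time_decision(checks: Dict[str, Optional[bool]]) -> Decision:
    """Allow only when every check ran and passed. Uncertainty pauses the action."""
    if checks and all(result is True for result in checks.values()):
        return Decision.ALLOW
    return Decision.PAUSE
```

An empty result set also pauses, on the assumption that "no checks ran" should never be read as "all checks passed."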
