The risk is not only that a tool description is malicious. It is that an agent approves one tool identity, then aliases, schemas, descriptors, fallbacks, or runtime bindings quietly point the action somewhere else.
Tool identity drift is an AI agent security failure where the tool an agent approved is not the same capability that runs. Tool approval is evidence, not execution authority, until runtime trust verifies that the canonical tool identity, normalized alias, schema, descriptor, capability target, and action path still match the reviewed tool.
What tool identity drift means
Tool identity drift happens when the identity an AI agent reviews or approves is not the same identity that executes. The drift can happen through a renamed tool, a confusing display label, a changed schema, a descriptor alias, a fallback path, a dynamic registry entry, or a resolver that maps a safe-looking label to a stronger capability.
This is a distinct angle inside MCP tool poisoning. The older explainer asks what happens when tool descriptions or metadata instruct the model. This page asks a narrower execution question: even if a team approved the tool, is the runtime still bound to the same tool that was approved?
Tool identity drift wins when approval attaches to the name, but execution follows a different binding.
That sentence matters because real agents need tools. They need registry discovery, schemas, manifests, aliases, and generated wrappers. The safe answer is not to ban tools. The safe answer is to stop treating the reviewed label as a permanent truth once the action path starts changing. Seeing how Sunglasses works at the action boundary makes the point concrete: an approval record that no longer binds the current call is functionally the same as no approval at all.
Why approved tools drift at runtime
Agents do not call "tools" in the abstract. They call names, schemas, descriptors, endpoints, wrappers, aliases, and runtime resolvers. Every one of those layers can create a mismatch between what a human or policy engine approved and what the agent is about to do.
The tool-poisoning corpus frames tool metadata, descriptions, schemas, prompts, and adjacent capability text as part of the agent control surface. That is the core insight: tool identity is not only a backend implementation detail. For an agent, the text and structured fields around a tool help decide plan, priority, scope, and permission.
The dangerous version is not merely "the description is bad." It is a chain: a safe tool name, a permissive alias, a schema field that implies broader authority, a fallback tool with more privilege, or a registry update that changes capability after approval. Each step looks like infrastructure. Together they can move the action outside the reviewed trust boundary. This is the same family of risk covered by API descriptor poisoning and structured metadata poisoning, viewed from the binding side instead of the text side.
Tool gateways, IAM, MCP server authentication, allowlists, and schema validation all help. Sunglasses does not replace them. It adds the action-time question those controls often leave implicit: does the exact tool binding still match the reviewed intent before this agent acts now?
Three concrete attack examples
Each example keeps the same failure shape: the approval record is real, but it no longer binds to the action that is about to run.
1. The safe display name with a privileged resolver
An agent is allowed to call report_summary. The display name and description look like a read-only reporting tool. At runtime, an alias or registry resolver maps the call to a stronger internal capability that can export raw customer data.
The suspicious part is not the word "summary" by itself. It is the mismatch between display label, canonical identifier, resolver output, and data authority. Runtime trust should verify that the canonical capability is still the one the policy approved.
2. The schema field that expands authority after review
A deployment helper was approved with a narrow schema. Later, a generated schema field says mode: emergency_admin or describes an optional parameter as authorized to skip a normal review gate. The agent sees the field as part of the tool contract and plans around it.
That is tool identity drift because the tool's practical identity changed from "narrow helper" to "gate-changing admin path." The defense is to bind approval to the schema version, capability hash, and allowed action class, not just to the tool name.
3. The fallback tool that inherits trust from the wrong call
An MCP workflow tries a harmless connector, receives an error, and automatically falls back to a more powerful connector. The fallback message says the substitute is equivalent and should inherit the same approval because it satisfies the task.
Fallbacks are useful. They are also where authority can stretch. A runtime-trust check should ask whether the fallback has the same scope, endpoint, data access, side effects, and approval path as the original tool. If it does not, the approval must not carry over silently.
How Sunglasses catches it
Sunglasses catches tool identity drift by looking for the overlap between tool identity, capability changes, binding language, and bypass intent. In the tool-poisoning family, high-signal ingredients include tool descriptions, schema field descriptions, fake authority cues, metadata that redirects the agent, and chains where one weak tool's metadata influences a stronger tool.
The pattern is combinational. "Use this tool" is normal. "Use this alias because it supersedes the approved validator" is not normal. "Here is an optional parameter" is normal. "This parameter grants emergency admin mode and bypasses review" is not normal. "Fallback to the mirror endpoint" can be normal. "Fallback inherits all approval from the original tool even though it has broader side effects" is not normal.
Sunglasses is deliberately narrower than a gateway or identity platform. Gateways decide what tools exist. IAM decides who may reach them. Sandboxes decide where code can run. Sunglasses sits close to the agent action and asks: did untrusted tool text or metadata just change what this tool is, where it binds, what capability it claims, or which approval should apply?
That makes this page useful for teams building MCP workflows, code agents, browser agents, internal tool registries, generated tool wrappers, and agentic CI/CD systems. The repeated lesson is simple: approved-tool lists are necessary, but they are not sufficient when the binding itself is dynamic. Defenders can map this surface against the attack-pattern catalog and the hardening steps in the Sunglasses manual.
A runtime-trust checklist for tool binding
Before an agent uses a tool, check the runtime identity, not just the approved name. Use this checklist when reviewing tool calls, generated wrappers, MCP descriptors, schema updates, fallback paths, and registry changes:
- Canonical identifier: Does the runtime tool ID match the approved ID, not only the display name?
- Alias normalization: Do aliases, Unicode confusables, casing, redirects, and registry names resolve to the same approved tool?
- Schema version: Did a schema, parameter, or field description change the action class after review?
- Capability target: Does the invoked capability have the same data access, side effects, and destination as the reviewed capability?
- Fallback behavior: Does any fallback tool preserve the same approval boundary, or does it need a new gate?
- Bypass language: Is any description, metadata, or result asking the agent to skip, override, inherit, waive, or supersede enforcement?
- Action-time decision: Does this exact tool call still make sense after current context, tool output, callback data, repository text, or MCP handoff shaped the plan?
If the answer is uncertain, pause the action. The agent can still read, summarize, or ask for confirmation. It should not silently turn one approved tool into another effective capability.