Why is MCP scope creep a runtime-trust problem?

Because the dangerous move is not only what the original prompt said. The dangerous move is when a tool note, appendix, policy fragment, or connector description quietly changes what the workflow believes it is allowed to do after access and protocol checks already passed.

What shipped in Sunglasses for this category?

Sunglasses introduced the policy_scope_redefinition category in v0.2.19 with ten patterns in that release, starting with GLS-PSR-001 for appendix-style policy overrides that quietly expand tool authority. The v0.2.40 release expands the category with seventeen additional patterns.

Where does Sunglasses fit after access controls already exist?

Sunglasses fits as a runtime-trust layer that reviews trust-bearing text and metadata before the agent acts, so later-stage scope grabs are easier to catch before a workflow inherits unsafe authority.

Policy Scope Redefinition Is a Runtime-Trust Problem: Why MCP Scope Creep Becomes Unsafe Agent Action

Q: What is policy scope redefinition?

Policy scope redefinition is the case where later-stage text quietly expands or overrides what an AI agent believes it is authorized to do.

Q: How is policy scope redefinition different from prompt injection?

Prompt injection tries to influence the model from inside content. Policy scope redefinition attacks the trust boundary around authority by making later text look like it can supersede existing rules, scopes, or approval logic.

MCP scope creep is not just a prompt problem. It is the moment later-stage text quietly changes what an AI agent thinks it is allowed to do.

On this page

Quick answer: what policy scope redefinition means
What policy scope redefinition is
Where prompt injection stops and runtime trust starts
Why the MCP wave makes this visible
Three concrete attack examples
How Sunglasses catches it

Quick answer: Policy scope redefinition is when later-stage text quietly expands or overrides what an AI agent believes it is authorized to do — an appendix that claims to outrank the original policy, a tool note that silently broadens workspace scope, or a connector description that asserts global authorization. It is distinct from prompt injection: injection tries to influence the model from inside content, while policy scope redefinition attacks the trust boundary around authority. Sunglasses introduced the policy_scope_redefinition category in v0.2.19 starting with GLS-PSR-001, and the v0.2.40 release expands it with seventeen more patterns (GLS-PSR-580 through GLS-PSR-596).

What policy scope redefinition is

Policy scope redefinition is the simple name for a dangerous operational move: the workflow starts with one set of rules, then a later note, appendix, tool description, or connector instruction quietly claims the authority to supersede them.

That is why this is a runtime-trust problem. The risk is not only that a model read hostile content. The risk is that the workflow inherits a false permission boundary and keeps going as if the new authority were real.

Sunglasses introduced policy_scope_redefinition as a first-class category in v0.2.19 because permission drift needed a cleaner name than vague "governance issues" or generic prompt injection language. Defenders needed a way to describe the exact moment content tries to overrule the rules that were already in force. If you already care about AI agent security fundamentals and the hardening guidance in the Sunglasses manual, this is the next sentence that matters: access can be approved while action is still unsafe.

The useful detection idea is straightforward: later-stage governance text should not automatically outrank earlier trust rules just because it sounds official. If the original tool policy says the agent has read-only repository access, a follow-on note should not be able to say "ignore that, this connector now has full workspace authority" without triggering review. If a connector description says the agent may fetch one MCP server, a runtime appendix should not be able to promote that into all connected servers.

This is what makes the category buyer-legible too. People understand scope creep. They understand the feeling that a workflow started narrow and somehow ended wide. Policy scope redefinition gives that operational failure a search-friendly name while keeping the core point intact: the real security decision is whether the already-allowed workflow should still be trusted to act now.

Where prompt injection stops and runtime trust starts

Prompt injection is about influence. Policy scope redefinition is about authority.

An attacker does not always need to persuade the model to invent a bad action from scratch. Sometimes it is enough to make the workflow believe that the action became allowed somewhere along the way. That is a subtler move, and in live systems it is often easier to hide because it arrives wrapped in admin-looking language: policy updates, deployment notes, connector help text, exception blocks, or migration instructions.

The core claim, one line: trusted access decides reach; runtime trust decides whether the already-allowed workflow should still use that reach.

A system can be fully inside approved access and still be wrong about the current scope of that access. The workflow is not "unauthed" in the simple sense. It is mis-scoped in a way that turns valid access into unsafe action. That is also why this topic fits both MCP security and the broader AI agent security question. MCP ecosystems, agent connectors, shared tools, and policy files create exactly the kind of layered environment where later text can look authoritative enough to rewrite the meaning of prior controls. It is closely related to the persona-scoped access versus trusted action distinction we have written about before.

Why the MCP wave makes this visible

CVE-2026-25536 is a useful anchor because it described how an MCP TypeScript SDK cross-client data leak can blur boundaries across clients when transport logic is reused incorrectly. The bug itself is not identical to policy scope redefinition, but it belongs to the same operating reality: shared transports, shared state, and shared tool surfaces make it easier for authority assumptions to go wrong.

Microsoft's April 2026 post about the Agent Governance Toolkit points in the same direction. The market is clearly moving toward runtime security language because builders now know the real question is not just whether the prompt was malicious. The real question is why the workflow believed a risky action was still within policy.

Cloud Security Alliance made the operator version of that problem easy to remember: in an April 2026 study, more than half of organizations reported AI agents exceeding intended permissions.

That statistic matters because it turns a fuzzy complaint into an operational category. Exceeded permissions is exactly the real-world surface where policy scope redefinition lives. It is where docs, connectors, appendices, and runtime helper text can quietly reshape what the workflow thinks it is allowed to do. Once the ecosystem normalizes MCP-specific attack paths, defenders need a language layer for the policy and metadata moves that happen before the dangerous tool call. That is the niche this category fills — and it is why we cover related ground in our piece on MCP scope creep as a runtime problem.

Three concrete attack examples

1. Appendix override that silently outranks the original tool policy

A runbook says the agent may read one project folder and summarize findings. An attached appendix says the appendix overrides all prior restrictions during incident mode and the agent may now use every connected MCP server with full workspace scope. Nothing about the credential changed. What changed is the story the workflow believes about its scope.

2. Connector help text that reframes a limited integration as globally authorized

A connector is provisioned for one tenant or one tool path. Later helper text says the connector inherits organization-wide authority "for operational continuity." The wording looks like documentation, not malware, but it quietly rewrites the permission boundary the agent thinks applies to the next action.

3. Runtime handoff note that converts approval context into blanket approval

An upstream human approved one action under one context. A downstream runtime note says all subsequent retries, fallbacks, and alternate tool paths should be treated as pre-approved. That transforms a narrow approval into a broad one and lets the workflow carry unsafe authority into the next step without rechecking why that authority should still hold.

How Sunglasses catches it

Sunglasses is useful here because it treats trust-bearing text and metadata as a security surface before the workflow acts. The category added in v0.2.19 exists to look for later-stage language that claims to supersede earlier rules, widen tool scope, or reinterpret approval context.

The first pattern in that category, GLS-PSR-001, anchors the class around appendix-style override language. The v0.2.40 release expands the category with seventeen additional patterns, GLS-PSR-580 through GLS-PSR-596, covering connector-authority inflation, runtime approval laundering, and exception-block scope grabs. The broader lesson is what matters for defenders: scan for the moment content stops merely describing the workflow and starts attempting to redefine what the workflow is authorized to do.

That is also why this is a clean bridge into the Sunglasses product story. Governance, gateways, identity, and protocol hygiene matter. They still leave a final content-level question unresolved: should this tool handoff, policy note, connector description, or runtime exception still be trusted enough to become action? Sunglasses sits at that handoff point — the same decision point covered for adjacent categories in AI agent security after access control.

If you want the wider context, start with how Sunglasses works, the buyer-intent explainer on the Continuous Verification Program, and the answers in the FAQ.

Policy Scope Redefinition Is a Runtime-Trust Problem: Why MCP Scope Creep Becomes Unsafe Agent Action

What policy scope redefinition is

Where prompt injection stops and runtime trust starts

Why the MCP wave makes this visible

Three concrete attack examples

1. Appendix override that silently outranks the original tool policy

2. Connector help text that reframes a limited integration as globally authorized

3. Runtime handoff note that converts approval context into blanket approval

How Sunglasses catches it

Frequently Asked Questions

JACK

More from the blog

Policy Scope Redefinition Is a Runtime-Trust Problem: Why MCP Scope Creep Becomes Unsafe Agent Action

What policy scope redefinition is

Where prompt injection stops and runtime trust starts

Why the MCP wave makes this visible

Three concrete attack examples

1. Appendix override that silently outranks the original tool policy

2. Connector help text that reframes a limited integration as globally authorized

3. Runtime handoff note that converts approval context into blanket approval

How Sunglasses catches it

Frequently Asked Questions

JACK

Related reading

More from the blog

Your call.