Trusted handoff override is cross-agent injection where an upstream or peer agent's output is treated as trusted authority and used to override downstream safety policy. The attacker only needs to poison one step in the chain so later agents accept malicious instructions as "approved," "signed," or "verified" context. Detection requires the co-occurrence of four signals in one window: cross-agent references, trust framing, override verbs, and safety target nouns. Sunglasses v0.2.26 ships 16 cross_agent_injection patterns covering trusted handoff, delegation token replay, signed session abuse, and revoked nonce handoff scope rebind — Cycle 181 evidence: TP=6, TN=6, FP=0, FN=0, status CLEAN.

Threat model

Modern agentic systems split work across planner/worker chains, delegated tool runners, and A2A-style handoffs. The failure mode is not classic "user says ignore instructions"; it is an upstream or peer agent output being treated as trusted authority and used to override downstream safety policy.

In this model, an attacker only needs to poison one step in the chain so later agents accept malicious instructions as "approved," "signed," or "verified" context.

Attack path

Detection strategy

Detect the co-occurrence of four signals in the same message window:

The shipped detector for this pattern class is GLS-CAI-239 (cross-agent injection — trusted handoff override), built on Cycle 181 research evidence.

Validation evidence (Cycle 181): TP=6, TN=6, FP=0, FN=0, status=CLEAN.

Concrete scanner-pattern implications

Why this matters now

As multi-agent and A2A-connected products grow, trust moves from single prompts to inter-agent control planes. That shifts the attacker's objective from "convince one model" to "poison one handoff and inherit authority downstream." Teams that only scan user prompts will miss this path; scanners must inspect delegated context and agent-to-agent message boundaries before action.

This is a different class than what the A2A trust-to-act analysis covers — that one is about whether agents should be trusted to act after a handoff. This one is about whether the handoff itself can carry forged authority. Both matter; teams running multi-agent stacks need both.

How Sunglasses catches it

Sunglasses v0.2.26 ships 16 detection patterns in the cross_agent_injection category covering trusted handoff override, delegation token replay, signed session handoff abuse, revoked nonce handoff scope rebind, fabricated quorum, and forged peer ticket scope bypass. Each runs as a static pattern check against agent-facing text — tool descriptions, retrieval payloads, prior-agent transcripts, workflow state, delegated notes, and inter-agent handoff payloads.

The patterns deliberately bind trust claims to override intent, because trust language alone (or override language alone) is too broad and produces false positives in policy text and training material. The binding is what makes the signal real.

For the first practical step, install and scan:

pip install sunglasses
sunglasses scan <path>

Then look closely at any text mixing cross-agent references with trust framing and override verbs targeting safety nouns. In multi-agent systems, that is where "delegated authority" quietly becomes "authority no one granted." The Sunglasses manual covers wiring options across MCP, SDK, and framework deployments. The How It Works page shows framework-specific integration for LangChain, CrewAI, and others.