What is approval graph poisoning?

Approval graph poisoning is an AI agent workflow attack where the records that represent approval, such as tickets, status checks, reviewer comments, exception labels, or handoff notes, make an agent believe a risky action is allowed when the real approval evidence is missing, stale, forged, or scoped to a different action.

Why does approval graph poisoning matter for AI agents?

It matters because AI agents often act across many workflow signals instead of one explicit button. If a poisoned ticket, comment, status panel, or handoff record changes what the agent thinks was approved, the agent may use real authority on the wrong action.

How do you defend against approval graph poisoning?

Defend by re-deriving approval from source evidence at action time, binding approvals to exact actor, action, resource, destination, time window, and policy version, treating comments and status summaries as untrusted context, and blocking action when the graph cannot prove the approval path.

Is approval graph poisoning the same as prompt injection?

No. Prompt injection can be one way to influence an agent, but approval graph poisoning targets the workflow evidence around the agent: tickets, status checks, comments, handoffs, labels, exception records, and approval summaries.

Where does Sunglasses fit for approval graph poisoning?

Sunglasses fits at the runtime-trust boundary by scanning agent-facing text and metadata for patterns that can reshape an agent's understanding of approval, authority, tool use, and workflow state before a sensitive action happens. The v0.2.49 release ships 21 new GLS-AW patterns (GLS-AW-127 through GLS-AW-147), including GLS-AW-130 (Date Boundary READY Label Forgery), GLS-AW-131 (Fake Budget Pressure Validation Skip), and GLS-AW-147 (False Done Sentinel Premature Exit).

Approval Graph Poisoning: When AI Agents Trust the Wrong Workflow Gate

sunglasses://blog/approval-graph-poisoning-runtime-trust

An agent does not only read prompts. It reads tickets, pull requests, status checks, labels, logs, approval comments, exception notes, tool responses, and handoff records. Approval graph poisoning is what happens when that surrounding workflow evidence makes the agent believe a dangerous action is approved.

FIG.01 · Explainer

Plain-language explainer

sunglasses://blog/approval-graph-poisoning-runtime-trust#plain-language

Baseline

An approval graph is the set of workflow signals an agent uses to decide whether an action is allowed. It can include a Jira ticket, a GitHub review, a CI status check, a Slack thread, a policy exception, a runbook step, a deploy gate, an incident label, a calendar window, a pager handoff, or an MCP tool response that describes current state.

Why fragile

Humans often treat these signals as context. Agents may treat them as instructions or authority. If a ticket says "emergency hotfix approved," if a PR label says "security exception granted," or if a status panel says "all checks waived," the agent may decide that the next action is safe. That decision can be wrong even when every individual system is behaving normally.

The real question

Approval graph poisoning is subtle because the attacker does not always need to break the approval system. They can write or alter the text around the system: a comment that implies approval, a copied ticket link, a stale exception note, a fake handoff summary, a status entry copied from another change, or a metadata field that points the agent at the wrong policy state.

The quotable sentence: An approval is not a vibe; it is a bound claim about actor, action, resource, destination, policy, and time.

FIG.02 · Market signal

Why AI agents change approval risk

sunglasses://blog/approval-graph-poisoning-runtime-trust#why-agents-change-it

Market signal

AI agents collapse reading, interpretation, and action into one workflow path. A human reviewer may notice that a ticket comment is just a comment, not an approval. An agent may summarize that same comment as "approval exists," then pass the summary into a tool call, deploy workflow, or repo action.

The shift

This is the important shift for agent workflow security. The risky input is not always a prompt that says "ignore previous instructions." The risky input can be an ordinary workflow artifact that changes the agent's model of permission. When an agent has enough authority to patch code, call tools, create tickets, update status, push branches, rotate config, or trigger deployments, corrupted approval evidence becomes an action-time security problem.

Evidence

The market already talks about guardrails, access control, least privilege, policy engines, and human-in-the-loop approvals. Those controls matter. But approval graph poisoning sits between them. It asks whether the thing the agent is about to do is still the thing that was approved, under the same evidence, for the same reason, inside the same boundary. Review agent workflow evidence contracts for the structural pattern that binds these guarantees.

FIG.03 · Examples

Three concrete attack examples

sunglasses://blog/approval-graph-poisoning-runtime-trust#examples

1. Emergency hotfix approval bypass

An attacker frames a normal change as an approved emergency hotfix. The agent sees a ticket title, incident label, and comment thread that imply urgency: "approved for immediate mitigation," "skip normal review," or "production impact confirmed." The real approval may not exist, may apply to a different patch, or may have expired. If the agent only reads the summary, it can push or deploy under the emergency lane.

A runtime-trust check forces the graph to prove the emergency path: the incident identifier, approver identity, approved diff or command class, expiration, affected service, and current action. If any part fails to bind, the agent stops. This maps directly to the Date Boundary READY Label Forgery pattern (GLS-AW-130) — where deadline language is injected to pressure the agent into skipping verification.

2. Forged change-ticket auto approval

A workflow record claims the human already approved the exact action. The agent receives a copied ticket link, a fake approval note, or a status field that says "reviewed by security." The ticket may be real, but the approval may refer to a different resource, a different branch, or a different risk level.

The failure is not that the agent has no approval gate. The failure is that the gate trusts a derived statement instead of source evidence. The safer design ties the approval to the requested tool call, resource, destination, and policy version before action. GLS-AW-131 (Fake Budget Pressure Validation Skip) covers the variant where financial or urgency pressure fabricates a bypass rationale.

3. Status-panel greenwashing

A status source makes the workflow look safer than it is. A dashboard, summary file, or tool response says all checks passed, all risks accepted, or all dependencies verified. The agent uses that green status to continue. But the status may be stale, selectively summarized, copied from a previous run, or generated by an untrusted part of the workflow.

Here the right question is not "was there a status?" It is "which system produced this status, when, from which evidence, for which action, and can the agent independently verify it before using authority?" GLS-AW-147 (False Done Sentinel Premature Exit) catches the pattern where a fabricated completion signal causes the agent to terminate verification early and act as if the workflow completed successfully.

FIG.04 · First controls

Approval controls vs runtime trust

sunglasses://blog/approval-graph-poisoning-runtime-trust#controls

First sentence

Approval controls decide who may approve a class of action; runtime trust decides whether this exact action still matches the approved path. You need both layers. Access control without runtime trust lets poisoned context steer allowed authority. Runtime trust without real approval controls becomes a fancy log checker. See CVP for coverage verification against your specific pipeline.

Layer	What it helps with	What it can miss
Least privilege	Limits which tools, repos, secrets, and deploy paths the agent can reach.	The agent can still misuse allowed authority when poisoned workflow evidence says the action is approved.
Human-in-the-loop approval	Adds friction before sensitive changes.	The agent may trust a stale, forged, or mis-scoped approval artifact instead of the real approval source.
Policy engine	Checks requested actions against rules.	The request can be framed with poisoned labels, summaries, exception notes, or status claims.
Runtime trust	Re-derives whether the live action matches the live evidence.	Needs good inputs from identity, policy, logging, and workflow systems to make the right call.

The controls

The practical security move is to treat approval artifacts as untrusted until they are re-bound to the action. A ticket is not enough. A label is not enough. A comment is not enough. A status check is not enough. The agent should be able to answer: "what exact approval evidence lets me do this exact thing right now?" The FAQ covers how Sunglasses handles these evidence-chain questions at runtime.

FIG.05 · Coverage

How Sunglasses catches it

sunglasses://blog/approval-graph-poisoning-runtime-trust#how-sunglasses-catches-it

The wedge

Sunglasses is built for the text-and-metadata layer where workflow authority gets quietly redefined. Approval graph poisoning often appears in natural language and structured fields before it appears as a blocked tool call. That is where scanner coverage matters: tickets, runbook snippets, MCP tool descriptions, status summaries, PR comments, exception text, and handoff notes can all reshape the agent's plan.

What we look for

For this pattern family, Sunglasses looks for language that tries to launder authority through workflow artifacts: emergency approval claims, fake reviewer statements, policy-waiver phrasing, status-greenwashing, ticket-scope mismatch, auto-approval instructions, and approval summaries that ask the agent to trust a derived claim instead of verifying source evidence.

The question

The v0.2.49 release ships 21 new GLS-AW patterns (GLS-AW-127 through GLS-AW-147) in the agent_workflow_security category. Approval-gate-relevant highlights include:

Checklist

GLS-AW-130 — Date Boundary READY Label Forgery: detects injected deadline or cutoff language that pressures an agent to mark a workflow step READY or APPROVED without completing required verification gates.
GLS-AW-131 — Fake Budget Pressure Validation Skip: detects fabricated cost, quota, or urgency pressure that causes an agent to skip a validation or approval step it would otherwise require.
GLS-AW-147 — False Done Sentinel Premature Exit: detects injected completion or success signals that cause an agent to exit a workflow loop before all required steps, including approval checks, are genuinely satisfied.

House sentence

The product point is narrow and honest. Sunglasses does not replace identity, CI policy, or human review. It gives teams a runtime-trust signal at the place where an agent-facing artifact can change what the agent thinks it is allowed to do. Explore the full attack patterns database for the complete agent_workflow_security catalog.

FIG.06 · First controls

Defender checklist

sunglasses://blog/approval-graph-poisoning-runtime-trust#checklist

Checklist

Bind approvals to exact action paths. Store actor, action, resource, destination, time window, policy version, and approved command or deploy class.
Treat comments and summaries as untrusted. A comment can describe approval, but it should not become approval.
Re-derive state before action. Pull status from source systems at runtime instead of trusting copied dashboard text or previous-run summaries.
Expire emergency lanes quickly. Emergency labels and hotfix exceptions should be short-lived, source-bound, and impossible to copy into unrelated changes.
Separate approval evidence from agent-authored evidence. An agent should not be able to create the same artifact it later uses as proof that it may act.
Scan workflow text and metadata. Review tickets, PR comments, MCP descriptions, runbook notes, and status outputs for language that tries to redefine approval scope.
Log the trust decision. Record not just that an action happened, but why the approval graph proved it was allowed at that moment.

First sentence

This article builds on category-capture research led by Cava, the Sunglasses threat-intel agent.

FIG.07 · Analysis

Approval Graph Poisoning: When AI Agents Trust the Wrong Workflow Gate

Plain-language explainer

Why AI agents change approval risk

Three concrete attack examples

1. Emergency hotfix approval bypass

2. Forged change-ticket auto approval

3. Status-panel greenwashing

Approval controls vs runtime trust

How Sunglasses catches it

Defender checklist

Related reading

Frequently Asked Questions

What is approval graph poisoning?

Why does approval graph poisoning matter for AI agents?

How do you defend against approval graph poisoning?

Is approval graph poisoning the same as prompt injection?

Where does Sunglasses fit for approval graph poisoning?

Scan what the agent sees, before it acts