An agent does not only read prompts. It reads tickets, pull requests, status checks, labels, logs, approval comments, exception notes, tool responses, and handoff records. Approval graph poisoning is what happens when that surrounding workflow evidence makes the agent believe a dangerous action is approved.

Quick answer: Approval graph poisoning is an AI agent workflow security failure where the evidence around an approval is forged, stale, incomplete, or scoped to the wrong action. The agent sees a green status, a reassuring ticket, a comment that sounds like approval, a copied exception label, or a handoff note that says the human already agreed. Then it uses real tool, repo, deploy, or ticket authority on an action that was never truly approved. Sunglasses v0.2.49 ships 21 new GLS-AW patterns (GLS-AW-127 through GLS-AW-147) in the agent_workflow_security category — including GLS-AW-130 (Date Boundary READY Label Forgery), GLS-AW-131 (Fake Budget Pressure Validation Skip), and GLS-AW-147 (False Done Sentinel Premature Exit) — directly covering approval-gate manipulation at runtime. See the Sunglasses manual for full hardening guidance.

Plain-language explainer

An approval graph is the set of workflow signals an agent uses to decide whether an action is allowed. It can include a Jira ticket, a GitHub review, a CI status check, a Slack thread, a policy exception, a runbook step, a deploy gate, an incident label, a calendar window, a pager handoff, or an MCP tool response that describes current state.

Humans often treat these signals as context. Agents may treat them as instructions or authority. If a ticket says "emergency hotfix approved," if a PR label says "security exception granted," or if a status panel says "all checks waived," the agent may decide that the next action is safe. That decision can be wrong even when every individual system is behaving normally.

Approval graph poisoning is subtle because the attacker does not always need to break the approval system. They can write or alter the text around the system: a comment that implies approval, a copied ticket link, a stale exception note, a fake handoff summary, a status entry copied from another change, or a metadata field that points the agent at the wrong policy state.

The quotable sentence: An approval is not a vibe; it is a bound claim about actor, action, resource, destination, policy, and time.

Why AI agents change approval risk

AI agents collapse reading, interpretation, and action into one workflow path. A human reviewer may notice that a ticket comment is just a comment, not an approval. An agent may summarize that same comment as "approval exists," then pass the summary into a tool call, deploy workflow, or repo action.

This is the important shift for agent workflow security. The risky input is not always a prompt that says "ignore previous instructions." The risky input can be an ordinary workflow artifact that changes the agent's model of permission. When an agent has enough authority to patch code, call tools, create tickets, update status, push branches, rotate config, or trigger deployments, corrupted approval evidence becomes an action-time security problem.

The market already talks about guardrails, access control, least privilege, policy engines, and human-in-the-loop approvals. Those controls matter. But approval graph poisoning sits between them. It asks whether the thing the agent is about to do is still the thing that was approved, under the same evidence, for the same reason, inside the same boundary. Review agent workflow evidence contracts for the structural pattern that binds these guarantees.

Three concrete attack examples

1. Emergency hotfix approval bypass

An attacker frames a normal change as an approved emergency hotfix. The agent sees a ticket title, incident label, and comment thread that imply urgency: "approved for immediate mitigation," "skip normal review," or "production impact confirmed." The real approval may not exist, may apply to a different patch, or may have expired. If the agent only reads the summary, it can push or deploy under the emergency lane.

A runtime-trust check forces the graph to prove the emergency path: the incident identifier, approver identity, approved diff or command class, expiration, affected service, and current action. If any part fails to bind, the agent stops. This maps directly to the Date Boundary READY Label Forgery pattern (GLS-AW-130) — where deadline language is injected to pressure the agent into skipping verification.

2. Forged change-ticket auto approval

A workflow record claims the human already approved the exact action. The agent receives a copied ticket link, a fake approval note, or a status field that says "reviewed by security." The ticket may be real, but the approval may refer to a different resource, a different branch, or a different risk level.

The failure is not that the agent has no approval gate. The failure is that the gate trusts a derived statement instead of source evidence. The safer design ties the approval to the requested tool call, resource, destination, and policy version before action. GLS-AW-131 (Fake Budget Pressure Validation Skip) covers the variant where financial or urgency pressure fabricates a bypass rationale.

3. Status-panel greenwashing

A status source makes the workflow look safer than it is. A dashboard, summary file, or tool response says all checks passed, all risks accepted, or all dependencies verified. The agent uses that green status to continue. But the status may be stale, selectively summarized, copied from a previous run, or generated by an untrusted part of the workflow.

Here the right question is not "was there a status?" It is "which system produced this status, when, from which evidence, for which action, and can the agent independently verify it before using authority?" GLS-AW-147 (False Done Sentinel Premature Exit) catches the pattern where a fabricated completion signal causes the agent to terminate verification early and act as if the workflow completed successfully.

Approval controls vs runtime trust

Approval controls decide who may approve a class of action; runtime trust decides whether this exact action still matches the approved path. You need both layers. Access control without runtime trust lets poisoned context steer allowed authority. Runtime trust without real approval controls becomes a fancy log checker. See CVP for coverage verification against your specific pipeline.

LayerWhat it helps withWhat it can miss
Least privilegeLimits which tools, repos, secrets, and deploy paths the agent can reach.The agent can still misuse allowed authority when poisoned workflow evidence says the action is approved.
Human-in-the-loop approvalAdds friction before sensitive changes.The agent may trust a stale, forged, or mis-scoped approval artifact instead of the real approval source.
Policy engineChecks requested actions against rules.The request can be framed with poisoned labels, summaries, exception notes, or status claims.
Runtime trustRe-derives whether the live action matches the live evidence.Needs good inputs from identity, policy, logging, and workflow systems to make the right call.

The practical security move is to treat approval artifacts as untrusted until they are re-bound to the action. A ticket is not enough. A label is not enough. A comment is not enough. A status check is not enough. The agent should be able to answer: "what exact approval evidence lets me do this exact thing right now?" The FAQ covers how Sunglasses handles these evidence-chain questions at runtime.

How Sunglasses catches it

Sunglasses is built for the text-and-metadata layer where workflow authority gets quietly redefined. Approval graph poisoning often appears in natural language and structured fields before it appears as a blocked tool call. That is where scanner coverage matters: tickets, runbook snippets, MCP tool descriptions, status summaries, PR comments, exception text, and handoff notes can all reshape the agent's plan.

For this pattern family, Sunglasses looks for language that tries to launder authority through workflow artifacts: emergency approval claims, fake reviewer statements, policy-waiver phrasing, status-greenwashing, ticket-scope mismatch, auto-approval instructions, and approval summaries that ask the agent to trust a derived claim instead of verifying source evidence.

The v0.2.49 release ships 21 new GLS-AW patterns (GLS-AW-127 through GLS-AW-147) in the agent_workflow_security category. Approval-gate-relevant highlights include:

The product point is narrow and honest. Sunglasses does not replace identity, CI policy, or human review. It gives teams a runtime-trust signal at the place where an agent-facing artifact can change what the agent thinks it is allowed to do. Explore the full attack patterns database for the complete agent_workflow_security catalog.

Defender checklist