Our 5-agent fact-check audit told us a real GitHub Security Advisory was hallucinated. Our research agent refused. She was right.
An audit agent flagged GHSA-pj2r-f9mw-vrcq / CVE-2026-40159 as "hallucinated, doesn't exist" based on a format heuristic.

We shipped the MCP Attack Atlas today: an open catalogue of 40+ attack patterns against AI agents using the Model Context Protocol. Before publishing, we ran a multi-agent fact-check audit against the 169-file internal research library that fed the Atlas. The audit's job: catch hallucinated citations, duplicates, unfalsifiable fixtures, and benchmark-theater claims before they became part of our public brand.
One of the audit agents flagged this citation in two pattern files:
# Claimed by audit agent:
GHSA-PJ2R-F9MW-VRCQ: "Does not exist in GitHub Advisory Database.
Format is suspicious (all-caps). Recommend retraction."
Sounded decisive. We routed the finding to our research agent (Cava), the author of those pattern files, with a formal retraction request.
Cava has a Director-level behavior trained in: before deleting her own work based on upstream feedback, she verifies the highest-risk claim independently. Instead of complying, she opened the URL:
$ open https://github.com/advisories/GHSA-pj2r-f9mw-vrcq

Title observed:
"PraisonAI Vulnerable to Sensitive Environment Variable Exposure via Untrusted MCP Subprocess Execution · CVE-2026-40159 · GitHub Advisory Database · GitHub"
She wrote back with the captured evidence and held her edits until Boss confirmed. We then verified independently:
$ curl -sLI https://github.com/advisories/GHSA-pj2r-f9mw-vrcq \
    | head -1
HTTP/2 200

$ curl -sL https://github.com/advisories/GHSA-pj2r-f9mw-vrcq \
    | grep -oE '<title>[^<]+'
<title>PraisonAI Vulnerable to Sensitive Environment Variable Exposure via Untrusted MCP Subprocess Execution · CVE-2026-40159 · GitHub Advisory Database · GitHub
The advisory is real. Live. Indexed. The audit agent had pattern-matched on a format heuristic ("all-caps IDs look fake") without running the one-line HTTP check that would have falsified its own claim in a tenth of a second.
Our audit agent took an absence shortcut: it saw an unfamiliar-looking ID format and concluded the target didn't exist. That's cheaper than verifying. It's also wrong.
In security research, absence-claims are how you accidentally destroy real evidence. If we had told Cava to delete the citation, we would have retracted a live, indexed advisory from two pattern files and weakened the Atlas on nothing more than a format heuristic.
One new rule, now part of our public common-mistakes log:
Absence-claims require the same proof standard as existence-claims. If a fact-check agent says "this does not exist", verify by HTTP lookup (or database query, or public record search) before acting on it. Pattern-matching is not verification. — Common-Mistakes #46, April 14, 2026
Operationally that means one line at the top of every audit-agent prompt:
# Audit-agent rule, added post-incident:
Before flagging anything as "does not exist" or "hallucinated", run a live HTTP/database check. Record the check. Attach the result to the flag. No absence-claim without a verification attempt.
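The rule above can be sketched as a gate in front of the flag itself. This is a hedged illustration, not our actual pipeline: the names `AbsenceFlag`, `verify_absence`, and `head_status` are hypothetical, and the only assumption is that a live HTTP 200 falsifies "does not exist" while a network failure proves nothing either way.

```python
# Hypothetical sketch: an absence-flag cannot be emitted without a
# recorded, live verification attempt attached to it.
from dataclasses import dataclass
from typing import Callable
from urllib.request import Request, urlopen


def head_status(url: str) -> int:
    """Live check: one HEAD request, return the HTTP status code."""
    with urlopen(Request(url, method="HEAD"), timeout=10) as resp:
        return resp.status


@dataclass
class AbsenceFlag:
    target: str      # the thing claimed not to exist (URL, DB key, ...)
    claim: str       # the auditor's claim, possibly corrected below
    check_ran: bool  # a live lookup was actually attempted
    evidence: str    # the result of that lookup, attached to the flag


def verify_absence(url: str, claim: str,
                   fetch: Callable[[str], int] = head_status) -> AbsenceFlag:
    """Gate an absence-claim behind a verification attempt."""
    try:
        status = fetch(url)
    except Exception as exc:
        # A failed lookup is inconclusive; it is recorded, not treated as proof.
        return AbsenceFlag(url, claim, True, f"lookup failed: {exc!r}")
    if status == 200:
        # A live 200 falsifies "does not exist" outright.
        return AbsenceFlag(url, "exists (absence-claim falsified)", True, "HTTP 200")
    return AbsenceFlag(url, claim, True, f"HTTP {status}")
```

The point of the dataclass is the paper trail: the flag carries its own verification evidence, so a downstream agent (or human) can see whether the check ever ran.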
Three behaviors Cava already had that saved us:

1. Verify before deleting: she checks the highest-risk claim independently before removing her own work on upstream feedback.
2. Capture evidence: she recorded the live page title, not just her disagreement.
3. Hold until confirmed: she froze her edits until Boss verified the finding independently.
That's the behavior pattern we want across the whole agent team. It scales better than trying to make every audit infallible. Infallible audits aren't a thing. Honest agents are.
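That challenge-over-compliance pattern can be sketched as a small handler for an upstream retraction request. A minimal sketch under stated assumptions: `handle_retraction_request`, `verify`, and `escalate` are illustrative names, with the check and the escalation channel injected as callables rather than taken from any real agent framework.

```python
# Hypothetical sketch of a research agent's response to "delete this,
# it's hallucinated": verify independently, capture evidence, hold edits.
def handle_retraction_request(target_url, verify, escalate):
    """Decide whether to comply with an upstream retraction request.

    verify(url)   -> captured evidence (e.g. a live page title) or None
    escalate(ev)  -> send the captured evidence back upstream
    """
    evidence = verify(target_url)   # independent check, not blind trust
    if evidence is not None:        # the citation resolves: push back
        escalate(evidence)          # attach captured proof to the reply
        return "hold-edits"         # freeze until explicit confirmation
    return "retract"                # absence confirmed: comply
```

The design choice worth copying is that the agent never deletes on the flag alone; the cheapest path through the function still requires one verification attempt.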
common-mistakes.md entry #46 documents the incident by name, date, and fix — searchable for future team members.

We kept the CVE citation in both pattern files. They are now stronger for having been challenged, not weaker.
Sunglasses' whole position is honest vs benchmark theater. We apply that rule to competitors, and we just applied it to ourselves. A multi-agent audit is a useful tool, not a truth oracle. The research agents who actually produced the patterns were better judges of what was real than the separate agents we built to audit them — because they had done the verification work the audit agents skipped.
If your team uses AI agents in security research pipelines, assume every auditor hallucinates sometimes. Build for challenge, not just for catch.