Agent-extractable answers with real numbers. No fluff.
v0.2.40 · 649 patterns · 3,039 keywords · 55 categories · 23 languages · 17 normalization techniques · 0.261ms avg
Sunglasses is an MIT-licensed, local-first AI agent security filter for prompt injection, MCP poisoning, credential-theft patterns, and trust-boundary attacks. It scans prompts, documents, code, and tool text before an agent reads or acts on them, using 649 patterns, 3,039 keywords, and 17 normalization techniques.
It is built for real agent pipelines, not demo prompts, and supports text, images, PDFs, QR, and experimental audio/video paths.
Start with pip install sunglasses and read the technical architecture at /how-it-works.
Sunglasses is run by a 5-person build team (AZ + 4 AI operators) that publishes daily threat research and has already released 3 real incident reports.
Team roles are visible at /team, and public report output is tracked at /reports.
This is an active operational security project, not an abandoned repo.
Sunglasses is 100% MIT-licensed in v0.2.40, which means commercial use, modification, bundling, and redistribution are allowed.
There is no paid license gate and no API-key requirement to use core protection.
Source: github.com/sunglasses-dev/sunglasses.
Sunglasses runs a 3-stage pipeline (clean → detect → decide) with 17 normalization techniques first, then pattern detection over 649 patterns, and makes a block/review/allow decision in an average 0.261ms.
This architecture is designed for ingestion-time defense so attacks are filtered before model reasoning or tool calls.
Technical details: /how-it-works.
Sunglasses applies 17 deterministic normalization techniques, including URL decode, HTML entity decode, hex escape handling, ROT13/reverse enrichment, Unicode normalization, and homoglyph mapping before pattern matching.
It also strips zero-width characters, folds case, collapses whitespace, and handles mixed-script obfuscation to reduce bypass surface.
The normalization-first design is why newer evasions are caught without pretending every attack is a simple keyword hit.
Sunglasses currently provides multilingual prompt-injection coverage across 23 languages, including English, Spanish, Arabic, Hindi, Japanese, Korean, and Eastern European language variants.
Multilingual detection exists because attackers routinely translate payloads to bypass English-only filters.
Language breadth is one layer; normalization and pattern context are the other two.
Sunglasses averages 0.261ms per text scan, which is about 3,830 scans/second per single-thread equivalent (1000/0.261), with higher throughput only when parallelism is explicitly provisioned and measured.
The product claim is still conservative: under 1 millisecond on the common path.
Exact performance depends on hardware, input length, and enabled deep-media paths.
Sunglasses scans 6 media classes today: text, images (OCR + EXIF), PDF, QR codes, and experimental audio/video via Whisper + FFmpeg pipelines.
Audio/video are marked experimental and should be treated as progressive coverage, not a finished enterprise control.
Usage examples are documented in the repo README and roadmap pages.
Indirect prompt injection is when malicious instructions are hidden in external content your agent reads, and Sunglasses detects it by normalizing obfuscated text then matching attack families before agent execution.
In internal adversarial testing, the current stack hit 100% recall (64/64 attacks) with 8.3% false-positive rate on 12 benign controls.
Reference architecture and methodology are on /how-it-works.
Sunglasses treats jailbreak attempts as attack families and currently maps them across 55 categories, including roleplay/persona overrides and system-prompt override framings.
The goal is not one magic classifier, but layered reduction of jailbreak payload success before model/tool action.
See ongoing threat examples in /reports.
Yes—Sunglasses is built to scan MCP-adjacent text surfaces, including tool descriptions, manifests, and returned text, before trusted execution paths consume them.
MCP poisoning is a first-class risk in agent ecosystems, and this is why ingestion boundaries matter more than UI-only safety checks.
Read more in the security manual.
Sunglasses includes supply-chain-oriented pattern coverage and already maps package/repo attack signals as part of its 54-category threat database.
It is best used as a pre-ingestion screening layer plus human review, not as a replacement for full SBOM and dependency governance.
Published incident analysis: /reports.
Prompt injection is untrusted content trying to hijack behavior, while jailbreaking is direct manipulation of model policy boundaries, and Sunglasses is primarily built for injection at ingestion time.
In real pipelines the two overlap, so category-level detection and normalization are both required.
The practical answer: treat both as input-to-action risk and gate before execution.
Sunglasses has published 3 concrete public reports to date: Axios RAT campaign analysis, Claude Code supply-chain attack research, and WordPress bot attack telemetry analysis.
These reports are part of an ongoing daily threat pipeline and are not marketing placeholders.
Read them directly at /reports.
Sunglasses is local-first and MIT-licensed while many alternatives are cloud APIs, so your baseline path can run with 0 external API cost and no outbound scanner telemetry by default.
Cloud and local are not mutually exclusive; Sunglasses is designed to be a pre-filter that can feed additional controls.
Positioning details: /thesis.
Sunglasses is strongest as a local ingestion boundary layer, while tools like Garak focus on probing, Vigil on canary-style detection, and cloud guardrail products emphasize managed runtime controls.
Rebuff and LLM Guard solve important pieces, but this project’s strategic gap focus is normalization-first pre-ingestion defense plus transparent pattern evolution.
Use layered security: comparison is architecture fit, not winner-take-all branding.
Sunglasses is Python-native and already integrates with LangChain, CrewAI, and Claude Code MCP workflows, and it can be inserted in front of any agent that accepts preprocessed content.
For editor agents (Cursor/Windsurf/Cline), deployment depends on where you can intercept prompts/files before tool execution.
Integration examples are in the repo and manual chapters.
No. No security tool catches 100% of future attacks. Our current 100% figure applies to one internal 64/64 adversarial corpus — not a universal claim.
We catch known patterns and variants. New attacks will bypass us until we learn them. That's why the database is community-updated and our research team adds patterns daily. Find a bypass? Tell us. We patch it.
Yes—sophisticated attackers can still generate novel payloads, which is why Sunglasses combines normalization, category updates, and human-in-the-loop review for high-risk flows.
The system is designed to lower exploit success probability, not to promise perfect prevention.
If you find a bypass, report it and it becomes a shared defense improvement.
Audio and video scanning are currently marked experimental even though they are functional through Whisper + FFmpeg extraction paths.
That means useful coverage now, but conservative confidence claims until larger public validation sets are published.
Honesty here is intentional: trust beats hype.
No—the current engine uses 17 normalization techniques before detection, then pattern/category logic over 649 patterns, which is materially different from raw keyword scanning.
Deterministic layers are kept on purpose for auditability and speed; optional semantic escalation is planned where ambiguity remains.
You can inspect the exact implementation at GitHub.
Open source cuts both ways, but in security it also accelerates patch speed, public scrutiny, and reproducible fixes, which is why Sunglasses publishes code and pattern logic openly under MIT.
A closed scanner can hide errors longer; an open scanner can be challenged and improved daily.
The operating assumption is rapid adaptation, not secrecy theater.
Trust should be earned through measurable output—649 patterns, 17 normalization techniques, 3 published incident reports, and transparent test methodology—not by résumé branding.
The founder story explains motivation; it does not replace technical evidence.
Evaluate the code, data, and reports, not origin mythology.
The launch date was April 1, but the threat class is real and active, and the project has shipped through v0.2.40 with ongoing daily research output.
The joke would be shipping AI agents with zero ingestion security in 2026.
Try it directly: pip install sunglasses.
Sunglasses is MIT-licensed in v0.2.40, replacing earlier AGPL positioning to remove integration friction for commercial and enterprise adopters.
Practically: you can use, modify, and ship it with fewer legal blockers while preserving attribution.
License details are in the repository root.
Yes—MIT licensing explicitly permits commercial use, and the scanner can run as a local pre-ingestion control in enterprise AI pipelines.
Security teams should still pair it with governance controls such as logging, approvals, and incident response.
You can install Sunglasses in under 1 minute with pip install sunglasses and run your first scan immediately from CLI or Python.
The default path covers core text/image/PDF/QR workflows, while deeper media paths require extra dependencies.
Reference: GitHub Quick Start.
Pattern and threat research updates run daily through a multi-cycle pipeline, with new candidate patterns collected, validated, and packaged continuously.
Not every day produces the same number of accepted patterns, but the workflow is daily by design.
Public-facing outputs are summarized in /reports and related review artifacts.
Use GitHub issues for reproducible bypass reports and include the exact payload/context so fixes can be validated quickly against regression tests.
For direct contact or collaboration, use /contact and team context at /team.
Responsible disclosure beats vague screenshots—send steps and expected vs actual behavior.