An operator-grade AI agent hardening manual built around one rule: scan and enforce at the content-ingestion trust boundary before unsafe text becomes action.
Eight chapters. Each maps a threat to a control, a test, and a remediation. Updated as research lands. No sign-up. No paywall.
Status: 1 chapter live, 1 in progress, 6 planned. Roadmap shifts with the research.
The introductory chapter. What an AI agent is, why it can be attacked through content, and the four trust boundaries every deployment must enforce.
The shipping checklist for production AI agents. Threat → control matrix, implementation steps, three real case studies, validation tests. Drops within 24 hours.
The unique threat model of agents that read repos, run commands, and write code. MCP boundary failures, terminal-as-attack-surface, supply-chain trust.
The architecture for scanning untrusted content before it enters an agent's context window. Pattern coverage, false-positive budgets, performance trade-offs.
How agent toolchains get compromised. Skill registries, MCP server poisoning, prompt-injection in dependencies. Detection and isolation patterns.
Where one user's context ends and the next begins. Cross-session leakage, persistent memory poisoning, retrieval-time injection.
The fixture suite. Reusable test cases for prompt injection, exfiltration, tool abuse, and evasion. Pattern-DB-grounded, regression-tested.
What to do when a control fires. Triage, containment, evidence preservation, post-mortem template. The runbook nobody else publishes.
Existing reports and research that map directly into the manual structure.
One email per chapter. No newsletter spam. Unsubscribe in one click.
If your question isn't answered here, message the team via /contact.
An operator-grade hardening manual for AI agents, built around one rule: scan and enforce at the content-ingestion trust boundary before unsafe text becomes action. Each chapter maps a threat to a control, a test, and a remediation. Independent research. Free.
Engineers, security teams, and founders shipping AI agents to production. If you call an LLM with content you did not write, this manual is for you.
OWASP, MITRE, and NIST publish taxonomies and policy frameworks. This manual focuses on operator runbooks. We tell you what to scan, where to gate, how to test, and what to do when a control fires. We are not affiliated with any of those organizations and we encourage reading their materials alongside ours.
Yes. Every chapter is based on our own pattern database, our own scanner output, and publicly available security advisories (with citations). We do not copy proprietary content. Every quantitative claim links to its source. Comparisons to other projects reflect publicly available materials at the time of writing.
No. Sunglasses is independent. No partnership, sponsorship, or endorsement from any of those organizations. References to their work are commentary on publicly available materials.
No. We are independent. But our wedge is the same problem space: AI systems doing security work. Mythos finds vulnerabilities. Sunglasses keeps the AI agent itself from becoming the vulnerability.
Continuously. Every chapter has a last-updated date. New chapters ship as research lands. Pattern updates ship into the open-source scanner.
Yes. The scanner is open source on GitHub. The manual takes pull requests for new chapters, case studies, and pattern fixtures. Reach out via /contact.