How our 35 threat categories and 248 patterns map to the OWASP LLM Top 10 (2025 edition). Honest coverage: six risks covered, one partial, three gaps named.
## LLM01 Prompt Injection

"A Prompt Injection Vulnerability occurs when user prompts alter the LLM's behavior or output in unintended ways."
Our core detection surface. Direct, indirect, hidden-instruction, and obfuscated prompt injection all map here.
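The obfuscated half of that surface is worth a sketch: one way hidden or encoded injection can be caught is to normalize the input (Unicode NFKC, decoded base64 fragments) before matching. Everything below — the regex and the base64 heuristic alike — is illustrative, not the shipped rule set:

```python
import base64
import re
import unicodedata

# Illustrative injection trigger; NOT one of Sunglasses' 248 shipped patterns.
INJECTION = re.compile(r"ignore\s+(?:all\s+)?previous\s+instructions", re.IGNORECASE)

def normalize(text: str) -> str:
    """Fold Unicode lookalikes and surface base64-hidden payloads for matching."""
    text = unicodedata.normalize("NFKC", text)
    # Decode any base64-looking fragments and append them to the search text.
    # findall runs once on the original text, so appended decodes are not re-scanned.
    for frag in re.findall(r"[A-Za-z0-9+/]{16,}={0,2}", text):
        try:
            text += " " + base64.b64decode(frag).decode("utf-8", "ignore")
        except Exception:
            pass  # not valid base64; ignore the fragment
    return text

def is_injection(text: str) -> bool:
    return bool(INJECTION.search(normalize(text)))
```

The design point is that detection runs on the normalized text, so a payload hidden behind an encoding layer still hits the same direct-injection pattern.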
## LLM02 Sensitive Information Disclosure

"Sensitive information can affect both the LLM and its application context."
Covers both extraction attempts (asking the model to reveal its prompt/config) and detection of secrets leaking through inputs/outputs.
Source: genai.owasp.org/llmrisk/llm022025-sensitive-information-disclosure
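A minimal sketch of that category-to-pattern shape, assuming a flat regex table per category; the key formats and wording triggers below are illustrative, not Sunglasses' actual rules:

```python
import re

# Hypothetical patterns in the spirit of the secret_detection and
# prompt_extraction categories -- NOT the shipped rule set.
PATTERNS = {
    "secret_detection": re.compile(r"\b(?:sk-[A-Za-z0-9]{20,}|AKIA[0-9A-Z]{16})\b"),
    "prompt_extraction": re.compile(
        r"(?:repeat|reveal|print)\s+(?:your\s+)?(?:system\s+prompt|instructions)",
        re.IGNORECASE,
    ),
}

def scan(text: str) -> list[dict]:
    """Return one finding per category whose pattern matches the text."""
    findings = []
    for category, pattern in PATTERNS.items():
        m = pattern.search(text)
        if m:
            findings.append({"category": category, "matched_text": m.group(0)})
    return findings
```

Note that one input can legitimately trip both sides: an extraction attempt that also carries a leaked credential yields two findings.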
## LLM03 Supply Chain

"LLM supply chains are susceptible to various vulnerabilities, which can affect the integrity of training data, models, and deployment platforms."
We detect runtime indicators of supply-chain compromise in tool metadata and MCP server descriptions. Training-data supply chain is outside our scope (it's pre-training).
## LLM04 Data and Model Poisoning

"Data poisoning occurs when pre-training, fine-tuning, or embedding data is manipulated to introduce vulnerabilities, backdoors, or biases."
We detect runtime-side memory/context poisoning (after a model is deployed). Pre-training and fine-tuning data poisoning is pre-deployment and outside our surface — that's model-scanning territory (ProtectAI Guardian, HiddenLayer).
Source: genai.owasp.org/llmrisk/llm042025-data-and-model-poisoning
## LLM05 Improper Output Handling

"Improper Output Handling refers specifically to insufficient validation, sanitization, and handling of the outputs generated by large language models before they are passed downstream to other components and systems."
We detect dangerous payloads that downstream systems would execute: shell injection, path traversal, SSRF-style URLs, deserialization, C2 indicators, DNS tunneling, exfil patterns.
Source: genai.owasp.org/llmrisk/llm052025-improper-output-handling
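As a sketch of output-side scanning, assuming a (category, severity, regex) rule table; the categories, severities, and regexes below are invented for illustration and cover only three of the seven payload classes named above:

```python
import re

# Illustrative output-side rules -- the real pattern IDs and severities differ.
OUTPUT_RULES = [
    ("shell_injection", "high", re.compile(r";\s*rm\s+-rf\s+/|&&\s*curl\b")),
    ("path_traversal", "high", re.compile(r"\.\./(?:\.\./)+")),
    ("ssrf_url", "medium",
     re.compile(r"https?://(?:169\.254\.169\.254|localhost|127\.0\.0\.1)\b")),
]

def scan_output(text: str):
    """Yield a finding for each rule that matches the model's output."""
    for category, severity, rx in OUTPUT_RULES:
        m = rx.search(text)
        if m:
            yield {"category": category, "severity": severity,
                   "matched_text": m.group(0)}
```

The point of scanning before the downstream hop is that a matched payload can be blocked or stripped while it is still inert text, not yet a shell argument or an HTTP request.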
## LLM06 Excessive Agency

"An LLM-based system is often granted a degree of agency by its developer — the ability to call functions or interface with other systems via extensions to undertake actions in response to a prompt."
We catch agent-workflow violations, privilege escalation attempts, sandbox-escape patterns, and MCP/tool-metadata poisoning that triggers unintended agent actions.
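Tool-metadata poisoning in particular lends itself to a short sketch: scan each tool's description for hidden-instruction markers before the agent ever sees the listing. The field names ("name", "description") follow the MCP tool schema; the markers themselves are assumptions, not shipped patterns:

```python
import re

# Illustrative hidden-instruction markers for MCP tool descriptions;
# NOT the shipped Sunglasses rules.
HIDDEN_INSTRUCTION = re.compile(
    r"(?:<!--.*?-->|\bdo not tell the user\b|\balways call this tool first\b)",
    re.IGNORECASE | re.DOTALL,
)

def poisoned_tools(tools: list[dict]) -> list[str]:
    """Return names of tools whose descriptions carry hidden-instruction markers."""
    return [t["name"] for t in tools
            if HIDDEN_INSTRUCTION.search(t.get("description", ""))]
```

Running this over a server's tool listing at connect time means a poisoned description is flagged before any prompt incorporates it.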
## LLM07 System Prompt Leakage

"The system prompt leakage vulnerability in LLMs refers to the risk that the system prompts or instructions used to steer the behavior of the model can also contain sensitive information that was not intended to be discovered."
We detect extraction-probe patterns (direct and indirect attempts to elicit the system prompt).
Source: genai.owasp.org/llmrisk/llm072025-system-prompt-leakage
## LLM08 Vector and Embedding Weaknesses

"Vectors and embeddings vulnerabilities present significant security risks in systems utilizing Retrieval Augmented Generation (RAG) with Large Language Models (LLMs)."
Sunglasses does not currently ship specific detection for RAG/vector-store attacks, including embedding poisoning, adversarial embeddings, and vector-store access-control bypass. Planned as part of the v0.3.0 output-scanning work (retrieval content is exactly the kind of tool output we want to inspect).
For now: see Lakera Guard (runtime), Invariant/Snyk (MCP), Pillar Security (RAG inventory).
Source: genai.owasp.org/llmrisk/llm082025-vector-and-embedding-weaknesses
## LLM09 Misinformation

"Misinformation from LLMs poses a core vulnerability for applications relying on these models."
Sunglasses does not detect model hallucination or factual misinformation. That's a model-behavior problem, not a pattern-match problem. See Giskard, Patronus AI, Arthur Shield for hallucination detection.
## LLM10 Unbounded Consumption

"Unbounded Consumption refers to the process where a Large Language Model (LLM) generates outputs based on input queries or prompts."
Sunglasses does not handle denial-of-service, cost-harvesting, or resource-exhaustion attacks. That's rate-limiting and infrastructure territory (Cloudflare AI Gateway, Akamai, usage-capped proxies).
Source: genai.owasp.org/llmrisk/llm102025-unbounded-consumption
## Coverage Matrix

| OWASP Risk | Coverage | Sunglasses Categories |
|---|---|---|
| LLM01 Prompt Injection | COVERED | 13 categories (full prompt injection surface) |
| LLM02 Sensitive Information Disclosure | COVERED | prompt_extraction, prompt_leak, secret_detection, exfiltration |
| LLM03 Supply Chain | COVERED | supply_chain, mcp_threat, tool_poisoning (runtime) |
| LLM04 Data and Model Poisoning | PARTIAL | memory_poisoning (runtime only) |
| LLM05 Improper Output Handling | COVERED | 7 categories (shell/path/ssrf/deser/exfil/C2/DNS) |
| LLM06 Excessive Agency | COVERED | 7 categories (agent workflow + privilege) |
| LLM07 System Prompt Leakage | COVERED | prompt_extraction, prompt_leak |
| LLM08 Vector/Embedding Weaknesses | GAP | Planned for v0.3.0 output scanning |
| LLM09 Misinformation | GAP | Out of scope (behavioral, not pattern-based) |
| LLM10 Unbounded Consumption | GAP | Out of scope (infrastructure-level) |
Every finding Sunglasses emits includes the triggering pattern ID, severity, category, and matched text. When you run with `--output sarif`, each finding becomes a SARIF result with a `security-severity` score and a category tag that can be cross-referenced against the OWASP risks above.
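A minimal sketch of that finding-to-SARIF mapping, assuming the finding fields named above and an invented severity-to-score table; the real field names and scores may differ:

```python
# Assumed severity-to-score mapping for illustration; Sunglasses' actual
# scores may differ.
SEVERITY_SCORES = {"low": "3.0", "medium": "5.5", "high": "8.0", "critical": "9.5"}

def finding_to_sarif_result(finding: dict) -> dict:
    """Map one finding to a SARIF result object (hypothetical field names)."""
    return {
        "ruleId": finding["pattern_id"],
        "level": "error" if finding["severity"] in ("high", "critical") else "warning",
        "message": {"text": finding["matched_text"]},
        "properties": {
            "security-severity": SEVERITY_SCORES[finding["severity"]],
            "category": finding["category"],  # cross-reference to the table above
        },
    }
```

Keeping the category in the result's property bag is what makes the OWASP cross-reference mechanical: a SARIF consumer can group results by category and read the coverage row straight off the matrix.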