What is the main difference between Sunglasses and Promptfoo?

The categorical difference is problem layer, not feature overlap. Promptfoo is a pre-deploy LLM evaluation framework: it runs red-team prompts, scores model responses, catches regressions in CI, and helps you compare models before you ship. Sunglasses is a runtime ingestion filter: it scans every input an AI agent processes in production and blocks or flags adversarial content before the agent acts on it. Both are open-source and MIT-licensed. They operate at different points in the AI pipeline and solve different problems. Many teams use both.

Is Promptfoo open source?

Yes. Promptfoo is open source and available at github.com/promptfoo/promptfoo under the MIT license. Sunglasses is also fully MIT-licensed and open source at github.com/sunglasses-dev/sunglasses. Both tools are inspectable, forkable, and free to use commercially — license is not a differentiator between them.

What are the best Promptfoo alternatives for runtime detection?

If your goal is runtime detection during agent operation (not pre-deploy evaluation), Promptfoo is not primarily built for that role. Runtime-focused alternatives include Sunglasses (local-first Python library, 444 patterns, 23 languages, MIT), LLM Guard (open-source runtime filter), Lakera Guard (hosted commercial guardrail), Rebuff (heuristic plus LLM hybrid), and Vigil (canary-style detection). Sunglasses is specifically optimized as a local ingestion-time filter that runs at every prompt boundary in production.

Can I use Promptfoo and Sunglasses together?

Yes, and this is often the right architecture. Promptfoo catches safety regressions before you ship — red-teaming your model or pipeline in CI catches drift early. Sunglasses catches adversarial inputs that reach your production system at runtime — inputs you did not anticipate during pre-deploy testing. Running both gives you defense at two different layers: pre-deploy coverage with Promptfoo and runtime coverage with Sunglasses.

Does Promptfoo provide runtime protection for AI agents in production?

Promptfoo's primary surface is an evaluation and red-team framework that runs in your development or CI environment. Promptfoo has also published a guardrail/firewall product line — verify the current runtime feature set at promptfoo.dev. Sunglasses is built specifically for the runtime ingestion boundary: an open-source Python library that runs inside your production agent pipeline, scanning every input before the model or tool chain processes it, with a decision latency around 0.261ms.

How does Sunglasses compare to Garak and Promptfoo as eval tools?

Garak and Promptfoo are both evaluation and red-team tools — they probe AI models to assess safety posture before or outside of production. Sunglasses is not in that category. Sunglasses is a production runtime filter. The three are complementary: use Garak and Promptfoo to evaluate models and catch regressions before ship; use Sunglasses to filter adversarial inputs in production. Each tool addresses a different point in the AI security lifecycle.

Sunglasses vs Promptfoo — Runtime Detection vs Pre-Deploy Evaluation

What each tool is

Sunglasses

Sunglasses is a free, MIT-licensed Python library that scans every input an AI agent processes — text, documents, code, MCP tool descriptions, READMEs, retrieval results, and agent-to-agent messages — before the agent acts on it. It runs a 3-stage pipeline: normalize the input across 17 deterministic techniques (URL decode, Unicode normalization, homoglyph mapping, base64 decode, and 13 others), detect against 444 patterns across 54 attack categories, then decide block / review / allow. The decision happens in an average of 0.261ms with no network call.

Sunglasses is built for production runtime. It lives in your agent pipeline and fires on every input. There is no API key, no cloud dependency, and no outbound telemetry by default. Every pattern is publicly inspectable at github.com/sunglasses-dev/sunglasses. Install with pip install sunglasses.

Promptfoo

Promptfoo is an open-source LLM evaluation and red-team framework. Its core use case is pre-deploy testing: you define test cases and assertions, run them against one or more models or configurations, and get scored results. Teams use Promptfoo to catch safety regressions in CI, compare two models on the same prompt set, and structure red-team exercises before shipping AI features. It is TypeScript-based and runs primarily as a CLI or CI job, not as an in-process runtime guard. For current features and documentation, see promptfoo.dev directly.

Honest framing: This page is written by the Sunglasses team. We have represented Promptfoo accurately based on its public documentation and GitHub repository. Where Promptfoo's specifics are not clearly documented or may have changed since this writing, we say so explicitly rather than guess. Verify current Promptfoo capabilities at promptfoo.dev.

Side-by-side comparison

Dimension	Sunglasses	Promptfoo
License	MIT (free forever, commercial use OK)	MIT (open source)
Primary category	Runtime ingestion filter / AI agent security scanner	LLM evaluation and red-team framework
When it runs	Production runtime — fires on every agent input	Pre-deploy — CI pipeline, red-team exercises, evaluation runs
Primary language	Python	TypeScript / JavaScript
Install path	`pip install sunglasses`	`npm install -g promptfoo`
Deployment model	Local Python library — runs in your process, zero network call	CLI / CI job — typically runs as a test harness outside production
Detection patterns	444 patterns across 54 attack categories (v0.2.27, Apr 30 2026)	Pluggable graders and red-team strategies — user-defined test cases
Language coverage	23 languages (multilingual detection)	Not specifically documented as a multilingual detection layer
Normalization pipeline	17 normalization techniques applied before detection (base64, homoglyph, URL decode, etc.)	Not a normalization-first detection system — evaluates model responses, not input obfuscation
Output format	block / review / allow decision per input; SARIF 2.1.0 report	Scored evaluation report; HTML, JSON, CSV outputs
Use case fit	Ingestion boundary guard: stop adversarial content reaching your agent at runtime	Model/pipeline evaluation: score and compare AI behavior before or outside production
Data exposure	None by default — inputs never leave your process	Runs locally by default; may call external model APIs depending on configuration
Scan speed	0.261ms avg, ~3,830 scans/sec (single thread, local)	Designed for test runs, not production latency-sensitive paths
SARIF output	Yes (SARIF 2.1.0)	Not documented as a SARIF output format as of this writing
CVP / third-party evaluation	Anthropic Cyber Verification Program approved (org ID d4b32d1d-…). 7 published reports at /cvp.	No equivalent published CVP-style authorization known as of this writing

Where the table reads "not documented" or similar, that reflects publicly available information at time of writing — not an inference that the capability does not exist. Promptfoo's feature set evolves; verify current capabilities at promptfoo.dev.

When to pick Sunglasses

You need runtime protection in production. Promptfoo is a pre-deploy evaluation tool. If your requirement is filtering adversarial inputs at agent runtime — before your model reads them — Sunglasses is the right tool for that layer. See the architecture page for how the ingestion pipeline integrates.
You are building in Python. Sunglasses is a native Python library. If your agent stack is LangChain, CrewAI, AutoGen, or any Python-based framework, call engine.scan(text) before any model or tool invocation. One import, one function call.
You need multilingual coverage at ingestion time. Sunglasses detects prompt injection and related attacks across 23 languages, because attackers routinely translate payloads to bypass English-only filters. This happens at runtime, on every input, in under a millisecond.
You need audit-grade pattern transparency. Every Sunglasses detection pattern is open source and inspectable. If your security team needs to know exactly what fired and why — the logic is on GitHub. See the Open Source AI Agent Security Scanner page for the full catalog context.
You want zero subscription cost and no managed dependency. MIT license, local process, no network call. This matters for projects where a per-call cost model or an external dependency at the ingestion boundary is architecturally unacceptable.
Your agent processes MCP tool descriptions, READMEs, or retrieval content. Sunglasses scans all text surfaces an agent ingests — not just user messages. MCP tool poisoning, README poisoning, and retrieval injection are first-class attack categories. See the MCP Attack Atlas and the security manual for integration patterns.

When Promptfoo wins

You need to evaluate model safety before shipping. Promptfoo is purpose-built for running structured test cases against AI models. If you want to benchmark two models on the same adversarial prompt set, catch safety regressions in CI, or score model behavior systematically before a feature ships — Promptfoo is the right tool for that job.
Your team is TypeScript-first. Promptfoo is TypeScript-native. If your evaluation infrastructure is already JS/TS-based, Promptfoo integrates naturally into that stack without requiring a Python environment.
You want structured red-team exercises with scoring. Promptfoo's evaluation framework lets you define graders, assertions, and scoring criteria for red-team runs. This is different from runtime filtering — it is about measuring model posture against defined threat scenarios.
You need model comparison and regression tracking over time. Promptfoo supports comparing multiple models on the same prompt set and tracking how scores change across versions. This is a pre-deploy quality control loop, not an ingestion-time defense.
You are building a CI safety gate before production deploys. Integrating Promptfoo into your CI pipeline catches safety and quality regressions at the PR or deploy boundary. This is a different architectural position from Sunglasses, which catches adversarial inputs in the live system.

Why you might use both

The most complete AI agent security architecture addresses multiple layers. Promptfoo and Sunglasses are not alternatives — they are tools for different points in the same pipeline.

Promptfoo in CI, Sunglasses at runtime. Promptfoo catches drift before you ship — red-teaming your model or pipeline as part of the release process ensures known attack families are handled correctly before the new version goes live. Sunglasses catches what reaches production — inputs that did not exist when you ran your eval, zero-day variants, and multilingual obfuscations you could not predict in pre-deploy testing.
Complementary coverage by design. Pre-deploy evaluation tells you how your model behaves under known adversarial conditions. Runtime detection tells you what adversarial content is actually reaching your production system. Both signals are useful. Neither replaces the other.
No conflict in the stack. Promptfoo runs in your CI environment as a test harness. Sunglasses runs in your production Python process as an ingestion filter. They do not compete for the same position in the architecture. Adding one does not remove the value of the other.

As documented in the Sunglasses FAQ, "Sunglasses is strongest as a local ingestion boundary layer" while "tools like Garak focus on probing." The same logic applies here: Promptfoo focuses on pre-deploy evaluation; Sunglasses focuses on runtime ingestion filtering. Use both for layered defense.

For MCP-specific attack patterns and why ingestion-boundary filtering matters even when your model passes every eval, the MCP Attack Atlas covers the threat surface in detail. For published evaluation results under Anthropic's Cyber Verification Program, see the CVP page and the reports index.

For more on how the Sunglasses scanner fits into a broader AI security posture, the A2A trust boundary blog post and the persona-scoped access analysis cover real attack patterns the runtime layer is designed to catch — patterns that may not surface in pre-deploy evaluation because they depend on attacker-controlled runtime inputs.

Sunglasses vs Promptfoo — Runtime Detection vs Pre-Deploy Evaluation

What each tool is

Sunglasses

Promptfoo

Side-by-side comparison

When to pick Sunglasses

When Promptfoo wins

Why you might use both

Frequently Asked Questions

JACK

Sunglasses vs Promptfoo — Runtime Detection vs Pre-Deploy Evaluation

What each tool is

Sunglasses

Promptfoo

Side-by-side comparison

When to pick Sunglasses

When Promptfoo wins

Why you might use both

Frequently Asked Questions

JACK

Related reading

Your call.