Today is Day 1. Anthropic launched Claude Opus 4.7, shipped automatic Opus 4.7 cybersecurity safeguards, tied the release directly to Project Glasswing, and opened the new Cyber Verification Program for legitimate security researchers who need fuller access for defensive work. That is a real shift — and exactly why the open-source side of AI agent security matters more, not less.
For months, AI agent security could still be treated like a niche concern: a red-team topic, a prompt-injection edge case, a problem for frontier labs and Fortune 100 security teams. Today Anthropic made it public product policy. Their own words are unambiguous: Opus 4.7 ships with safeguards that "automatically detect and block requests that indicate prohibited or high-risk cybersecurity uses."
That means the market has changed. AI agent security is no longer a future category. It is a live operating constraint.
What Anthropic changed today
Anthropic's launch is important for three reasons.
First, Opus 4.7 is not just a model update. It is their first public proving ground for cyber controls after the Claude Mythos Preview and Project Glasswing announcements. Anthropic explicitly said it would keep Mythos limited, test new cyber safeguards on a less capable model first, and use what it learns to work toward safer deployment of Mythos-class systems.
Second, they introduced a formal split between general access and verified security work. If you are blocked while doing legitimate vulnerability research, penetration testing, or red-teaming, Anthropic now wants you in a review process: the Cyber Verification Program.
Third, they pushed cybersecurity into the center of the AI product conversation. Not safety in the abstract. Not policy PDF language. Actual product behavior, with actual blocked requests.
That is a big deal.
Project Glasswing is the enterprise move. Sunglasses is the open runtime layer.
Project Glasswing is Anthropic's defensive coalition play. Their launch group includes 12 major partners and institutions: Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks.
That is the top of the market.
Sunglasses is not that.
We are the open, local, runtime-side layer for everyone else: individual builders, security researchers, small teams, indie operators, and anyone shipping agents without an enterprise procurement cycle.
That distinction matters.
Anthropic can build provider-side controls into Claude. That is useful. But provider-side controls are not the whole problem. They do not eliminate AI prompt injection hidden in documents, malicious tool instructions, poisoned MCP descriptions, credential exfiltration embedded in normal-looking content, or unsafe action chains across mixed-model agent systems.
This is the part we keep repeating because it is still true: Runtime Governance Is Not Enough. If your model refuses a bad request but your agent still ingests untrusted content, follows poisoned instructions, or routes unsafe tool calls, you still have an agent security problem.
The open-source side is not hypothetical
We are not writing from theory here.
Sunglasses v0.2.14 just shipped to PyPI today — timed with the Opus 4.7 launch — adding 5 new attack categories specifically relevant to the agent stack: tool metadata smuggling, memory eviction-rehydration chains, multi-stage encoding, tool output poisoning, and provenance chain fracture. Our detection layer now spans 253 patterns, 1,475 keywords, and 40 attack categories. We have already published real public artifacts, including our Axios RAT detection report, alongside broader runtime security work around prompt injection, tool poisoning, MCP abuse, and agent trust-boundary failures.
That is the lane.
Not "we built the most powerful cyber model."
Not "we can out-offense the labs."
Our job is to make defensive AI agent security practical, inspectable, and available to the people who are actually wiring these systems together in the wild.
The XBOW angle
XBOW published an offensive Opus 4.7 workflow today. That is newsworthy, and honestly, it reinforces the same point from a different direction: frontier models are now operational enough that security workflows around them are becoming real, fast.
Our position is different.
Sunglasses is the defensive runtime layer for everyone who is not trying to build the most aggressive offensive workflow possible. We care about what happens before an agent acts: what it read, what it trusted, what hidden instructions got through, what tool descriptions were poisoned, what policies were bypassed, and what evidence you have afterward.
That is not a knock on offensive research. It is a distinction in operating model.
We filed for the Cyber Verification Program on Day 1
We did not wait around for the discourse to settle.
We filed for Anthropic's Cyber Verification Program today because this is exactly the kind of transition Sunglasses was built for: provider safeguards getting stricter, legitimate defensive work needing clearer boundaries, and security teams needing evidence about where model-side controls stop and runtime-side controls still matter.
If we get in, we will use that access the same way we approach everything else: bounded, defensive, public, and reviewable.
We want to know:
- what Opus 4.7 cybersecurity safeguards block,
- what they partially allow,
- what they over-block,
- and what still needs to be handled by independent runtime security.
That is useful to researchers. Useful to builders. Useful to Anthropic too.
Day 1 reality
Here is the honest version.
Anthropic just made "Opus 4.7 cybersecurity safeguards" a live search term. Project Glasswing is now a reference point. Claude Mythos Preview is the backdrop for the whole story. The Cyber Verification Program is now part of the real workflow for legitimate security research on Claude. And all of that is happening while the broader market is still underestimating how much AI prompt injection and trust-boundary abuse happen outside the model itself.
That gap is where we live.
If you are a big enterprise in a closed partner program, Glasswing is your story.
If you are everyone else trying to secure real agents now, the open-source side still needs to exist.
That is Sunglasses.
Why this matters for AI prompt injection and AI agent security
Most agent failures do not start with a user typing "do something evil." They start with trust mistakes: a poisoned README, a malicious retrieval chunk, an unsafe MCP description, a hidden instruction in a PDF, or a "normal" page that quietly rewrites the agent's priorities.
That is why AI prompt injection and AI agent security need their own defensive layer even in a world where labs are adding stronger model-side safeguards.
Provider safeguards matter.
Runtime security still matters.
Both are now part of the stack.
What to do next
If you are building with agents, do both:
- use the strongest provider-side safeguards you can get,
- and add your own runtime-layer defenses.
Do not assume one replaces the other.
Install Sunglasses:
pip install sunglasses
Source code: github.com/sunglasses-dev/sunglasses
If this thesis resonates, star the repo. That is the fastest way to help us keep shipping public defensive work while the wave is hot.
Because today Anthropic made AI agent security mainstream.
The open-source side needs to move just as fast.