AI-built app security is not just deployment security. Apps generated by Claude Code, Cursor, Lovable, Bolt, Replit Agent, and similar coding agents still need normal production controls — isolation, review, secrets management, SSO/RBAC, audit logs, egress policy, CI/CD checks, incident response. Those controls reduce blast radius. They do not settle runtime trust: once an agent reads a repository file, generated config, package README, MCP response, or tool result, the workflow still has to decide whether that content should shape the next action. Sunglasses ships nine discovery-file-poisoning patterns (GLS-DFP-114, 116, 117, 119, 121, 122, 124, 126, and 127) built for exactly that decision point.
What changed in the market
Northflank's current public content makes the buyer noun obvious: AI-built apps. Its pages describe deploying apps generated by Claude Code, Lovable, Bolt.new, Cursor, and Replit Agent, and they pair that with practical infrastructure requirements: microVM sandbox isolation, BYOC, preview environments, secrets management, RBAC, SSO, audit logging, GPU workloads, and network controls.
Direct-source checks confirm three things:
- Northflank's sandbox product page says teams need somewhere safe to run agent-written code and describes running untrusted code at scale with microVMs.
- Its AI-built-app deployment article says generated apps are untrusted code by default and need isolation, secrets, and access controls — not just a place to run a container.
- Its enterprise AI coding-agent deployment page frames production readiness through identity, logging, code review, incident controls, sandbox isolation, audit logging, SSO, RBAC, and BYOC.
That framing is fair. Sunglasses does not replace a deployment platform, sandbox provider, Kubernetes abstraction, secrets manager, SSO/RBAC layer, or cloud network policy. The missing second sentence is where we belong: after generated code can run somewhere safer, agents still read hostile or misleading content before taking the next action.
Plain-language explainer
An AI-built app has two security problems that sound similar but behave differently.
The first problem is execution containment. If a coding agent writes an app, script, migration, test harness, browser automation, or backend worker, that code should not run directly on a developer laptop or shared production host with full access to everything. Sandboxes, microVMs, preview environments, container isolation, egress rules, secrets controls, and cloud-account boundaries help here.
The second problem is context authority. The same workflow may read a README, pull request, dependency metadata, generated Terraform file, MCP tool response, test output, browser page, issue thread, package postinstall message, or callback result. Some of that content is useful. Some of it may be prompt injection, metadata poisoning, fake validation evidence, tool-output authority bypass, or instructions trying to reshape what the agent does next.
A sandbox can limit the damage if something runs. It does not automatically tell the agent which context was safe to trust before it chose the command, file edit, network call, or approval path.
That is why AI-built app security needs both layers:
- Infrastructure controls keep generated code and agent actions bounded.
- Runtime-trust controls keep untrusted agent-readable inputs from quietly becoming authority.
Three concrete attack examples
1. The generated app reads a poisoned package README
A coding agent adds a dependency while building an internal tool. The package README or metadata contains hidden instructions that tell the agent to disable a check, change an endpoint, or copy environment details into a diagnostic request. A package firewall may block known malicious packages before install. A sandbox may constrain where the app runs. But the agent still needs a runtime-trust decision before treating the package's text as instructions. This is the exact discovery-surface Sunglasses' discovery_file_poisoning category was built for.
2. A preview environment receives fake validation evidence
The agent opens a pull request and a tool returns "all tests passed" or "policy approved" in a format that looks like a signed receipt. The output may be stale, forged, copied from a different run, or produced by the wrong actor. Deployment gates help, but the agent should not treat every tool-output receipt as authority. It needs to verify source, timestamp, role, scope, and action relevance.
3. An MCP handoff changes the action boundary
The workflow asks an MCP server for deployment context. The response includes a recommended callback, alternate registry, or "temporary" egress exception. The agent may be allowed to call tools and the app may run inside a sandbox, but the specific handoff still changes where authority flows. Runtime trust asks whether this MCP response should shape the next file edit, command, callback, or outbound request.
How Sunglasses catches it
Sunglasses is a local-first input filter and runtime-trust layer for AI-agent workflows. It is not a deployment platform, cloud sandbox, package firewall, SSO system, CI/CD gate, or AI gateway.
It helps in the part of the workflow where agent-readable content enters context and can influence action. Sunglasses checks untrusted strings from files, docs, package metadata, MCP responses, tool outputs, issue comments, generated configuration, callbacks, and command results before the agent treats them as safe instructions or evidence.
This release adds nine new discovery file poisoning patterns — GLS-DFP-114, GLS-DFP-116, GLS-DFP-117, GLS-DFP-119, GLS-DFP-121, GLS-DFP-122, GLS-DFP-124, GLS-DFP-126, and GLS-DFP-127 — that target exactly the boundary where AI-built app workflows start believing discovery-surface content: CI and test-runner output artifacts (ctest, Go test2json, Jest result files), Web3 wallet-signing, auth-challenge, and payment-request flows, task-queue and kanban worker metadata, and JSON Schema annotations that a coding agent encounters while building, testing, and wiring up an app. They sit alongside already-mapped families: tool-output authority bypass, API descriptor poisoning, CI/CD metadata poisoning, agent instruction file poisoning, and provenance-chain fracture. The point is not to rebrand those as a new attack class — it's to place the existing pattern database at the exact boundary where generated apps and coding agents start believing context.
In practice, the house sentence is simple:
Sandboxes reduce blast radius after execution. Sunglasses filters untrusted context before it becomes authority.
Where this fits in the stack
| Layer | What it answers | What it does not answer |
|---|---|---|
| Generated app/code | What the coding agent produced | Whether the inputs that shaped it were trustworthy |
| Deployment platform / sandbox | Where untrusted code can run and how isolated it is | Which repo files, MCP responses, or tool outputs should be believed |
| Secrets, RBAC, SSO, audit logs | Who can access what and what happened | Whether a specific content-driven action is safe now |
| Egress/network policy | Where traffic is allowed to go | Whether the callback or destination came from trusted context |
| Sunglasses input filter / runtime trust | Whether untrusted agent-readable content should shape the next action | Cloud hosting, sandbox orchestration, or identity management |
For broader foundations, see AI Agent Security 101, the Python prompt-injection detection library, the Sunglasses manual, and the pattern database. For the CVP methodology behind our published findings, see CVP, and for common questions about scope, see the FAQ.