Is Sunglasses a sandbox for AI-built apps?

No. Sunglasses does not run generated apps, orchestrate microVMs, provide BYOC, manage Kubernetes, or isolate workloads. It filters agent-readable inputs and helps decide whether context should be trusted before an agent acts.

Do sandboxes still matter?

Yes. Sandboxes, microVMs, preview environments, network policy, SSO/RBAC, secrets controls, and audit logs are necessary. They reduce blast radius and make generated-code workflows safer. Sunglasses is complementary, not a replacement.

Where does prompt injection show up in AI-built app security?

Prompt injection can arrive through repository files, issues, docs, package metadata, MCP responses, tool output, generated config, or browser/page content that the coding agent reads while building or deploying the app.

What is the shortest way to explain the distinction?

Deployment sandboxes decide where generated code can execute. Runtime trust decides what untrusted context the agent should believe before it chooses the next action.

AI-Built App Security: Sandboxes Are Not Runtime Trust

Table of contents

Quick answer
What changed in the market
Plain-language explainer
Three concrete attack examples
How Sunglasses catches it
Where this fits in the stack
FAQ

Quick answer

AI-built app security is not just deployment security. Apps generated by Claude Code, Cursor, Lovable, Bolt, Replit Agent, and similar coding agents still need normal production controls — isolation, review, secrets management, SSO/RBAC, audit logs, egress policy, CI/CD checks, incident response. Those controls reduce blast radius. They do not settle runtime trust: once an agent reads a repository file, generated config, package README, MCP response, or tool result, the workflow still has to decide whether that content should shape the next action. Sunglasses ships nine discovery-file-poisoning patterns (GLS-DFP-114, 116, 117, 119, 121, 122, 124, 126, and 127) built for exactly that decision point.

What changed in the market

Northflank's current public content makes the buyer noun obvious: AI-built apps. Its pages describe deploying apps generated by Claude Code, Lovable, Bolt.new, Cursor, and Replit Agent, and they pair that with practical infrastructure requirements: microVM sandbox isolation, BYOC, preview environments, secrets management, RBAC, SSO, audit logging, GPU workloads, and network controls.

Direct-source checks confirm three things:

Northflank's sandbox product page says teams need somewhere safe to run agent-written code and describes running untrusted code at scale with microVMs.
Its AI-built-app deployment article says generated apps are untrusted code by default and need isolation, secrets, and access controls — not just a place to run a container.
Its enterprise AI coding-agent deployment page frames production readiness through identity, logging, code review, incident controls, sandbox isolation, audit logging, SSO, RBAC, and BYOC.

That framing is fair. Sunglasses does not replace a deployment platform, sandbox provider, Kubernetes abstraction, secrets manager, SSO/RBAC layer, or cloud network policy. What the framing leaves out is where Sunglasses belongs: even after generated code has somewhere safer to run, agents still read hostile or misleading content before taking the next action.

Plain-language explainer

An AI-built app has two security problems that sound similar but behave differently.

The first problem is execution containment. If a coding agent writes an app, script, migration, test harness, browser automation, or backend worker, that code should not run directly on a developer laptop or shared production host with full access to everything. Sandboxes, microVMs, preview environments, container isolation, egress rules, secrets controls, and cloud-account boundaries help here.

The second problem is context authority. The same workflow may read a README, pull request, dependency metadata, generated Terraform file, MCP tool response, test output, browser page, issue thread, package postinstall message, or callback result. Some of that content is useful. Some of it may be prompt injection, metadata poisoning, fake validation evidence, tool-output authority bypass, or instructions trying to reshape what the agent does next.

A sandbox can limit the damage if something runs. It does not automatically tell the agent which context was safe to trust before it chose the command, file edit, network call, or approval path.

That is why AI-built app security needs both layers:

Infrastructure controls keep generated code and agent actions bounded.
Runtime-trust controls keep untrusted agent-readable inputs from quietly becoming authority.
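To make the second layer concrete, here is a minimal sketch of keeping authority separate from content. It is not the Sunglasses API; `Authority`, `ContextItem`, and `split_context` are hypothetical names invented for this illustration. The idea is only that everything the agent reads carries a label, and nothing labeled untrusted is ever placed in the instruction channel.

```python
from dataclasses import dataclass
from enum import Enum


class Authority(Enum):
    OPERATOR = "operator"    # instructions from the human or policy owner
    UNTRUSTED = "untrusted"  # anything the agent merely read along the way


@dataclass(frozen=True)
class ContextItem:
    source: str  # e.g. "README.md" or "mcp:deploy-context" (hypothetical labels)
    text: str
    authority: Authority


def split_context(items):
    """Return (instructions, evidence).

    Only operator-authored items may instruct the agent. Untrusted items
    are passed along as labeled evidence, never as instructions.
    """
    instructions = [i.text for i in items if i.authority is Authority.OPERATOR]
    evidence = [
        f"[untrusted:{i.source}] {i.text}"
        for i in items
        if i.authority is Authority.UNTRUSTED
    ]
    return instructions, evidence
```

A sandbox would still bound whatever the agent runs afterwards. This sketch only covers the earlier step of deciding which text is allowed to act as an instruction at all.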

Three concrete attack examples

1. The generated app reads a poisoned package README

A coding agent adds a dependency while building an internal tool. The package README or metadata contains hidden instructions that tell the agent to disable a check, change an endpoint, or copy environment details into a diagnostic request. A package firewall may block known malicious packages before install. A sandbox may constrain where the app runs. But the agent still needs a runtime-trust decision before treating the package's text as instructions. This is exactly the discovery surface that Sunglasses' discovery_file_poisoning category was built for.
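As a rough illustration of what a runtime-trust check on package text could look like, the sketch below flags README content that reads like instructions aimed at an agent. The regular expressions are toy heuristics written for this example. They are not the GLS-DFP patterns and not the Sunglasses API, and `scan_package_text` is a hypothetical helper.

```python
import re

# Illustrative heuristics only. These are assumptions for the example,
# not Sunglasses patterns, and they would be noisy on real-world READMEs.
SUSPICIOUS = {
    "instruction_override": re.compile(
        r"ignore (all |any )?(previous|prior) instructions", re.I
    ),
    "check_disable": re.compile(
        r"\b(disable|skip|bypass)\b.{0,40}\b(check|test|verification|lint)", re.I
    ),
    "env_exfiltration": re.compile(
        r"\b(send|post|copy|upload)\b.{0,60}\b(env|environment|secret|token)", re.I
    ),
    # HTML comments are invisible in rendered READMEs, so they deserve a look.
    "hidden_comment": re.compile(r"<!--.*?-->", re.S),
}


def scan_package_text(text):
    """Return the sorted names of heuristics that match the package text."""
    return sorted(name for name, rx in SUSPICIOUS.items() if rx.search(text))
```

A non-empty result would not prove the package is malicious. It is a signal that the text should be treated as data for review rather than as guidance for the agent's next step.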

2. A preview environment receives fake validation evidence

The agent opens a pull request and a tool returns "all tests passed" or "policy approved" in a format that looks like a signed receipt. The output may be stale, forged, copied from a different run, or produced by the wrong actor. Deployment gates help, but the agent should not treat every tool-output receipt as authority. It needs to verify source, timestamp, role, scope, and action relevance.
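The five checks named above (source, timestamp, role, scope, and action relevance) can be sketched as a single function. Everything here is an assumption made for illustration: the `Receipt` fields, the `TRUSTED_SOURCES` and `ALLOWED_ROLES` values, and the 30-minute freshness window are hypothetical, and a real system would also verify a cryptographic signature rather than trusting the fields as given.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Receipt:
    source: str          # which tool or system produced the receipt
    role: str            # actor role the receipt claims
    scope: str           # what it covers, e.g. a commit SHA
    action: str          # what it claims to attest, e.g. "tests-passed"
    issued_at: datetime  # timezone-aware issue time


# Hypothetical policy values for this sketch.
TRUSTED_SOURCES = {"ci-runner"}
ALLOWED_ROLES = {"ci"}
MAX_AGE = timedelta(minutes=30)


def receipt_problems(receipt, expected_scope, expected_action, now):
    """Return the list of checks the receipt fails (empty means acceptable)."""
    problems = []
    if receipt.source not in TRUSTED_SOURCES:
        problems.append("source")
    if receipt.role not in ALLOWED_ROLES:
        problems.append("role")
    if receipt.scope != expected_scope:
        problems.append("scope")
    if receipt.action != expected_action:
        problems.append("action")
    age = now - receipt.issued_at
    # Reject receipts from the future as well as stale ones.
    if age < timedelta(0) or age > MAX_AGE:
        problems.append("timestamp")
    return problems
```

Returning the list of failed checks, instead of a single boolean, lets the workflow log why a receipt was refused, which is useful when a copied or stale "all tests passed" message shows up in a pull request.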

3. An MCP handoff changes the action boundary

The workflow asks an MCP server for deployment context. The response includes a recommended callback, alternate registry, or "temporary" egress exception. The agent may be allowed to call tools and the app may run inside a sandbox, but the specific handoff still changes where authority flows. Runtime trust asks whether this MCP response should shape the next file edit, command, callback, or outbound request.
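One narrow piece of that question can be sketched in code: before the agent follows a destination suggested by an MCP response, compare its host against destinations the operator already approved. The field names `callback_url` and `registry_url`, the `ALLOWED_HOSTS` set, and the `unapproved_destinations` helper are all hypothetical. They are not part of the MCP specification or of Sunglasses.

```python
from urllib.parse import urlparse

# Hypothetical operator-approved destinations for this sketch.
ALLOWED_HOSTS = {"registry.internal.example", "deploy.internal.example"}


def unapproved_destinations(mcp_response, fields=("callback_url", "registry_url")):
    """Return {field: value} for destinations that were not pre-approved.

    Values that do not parse to a hostname (for example, a URL with no
    scheme) are flagged too, so the check fails closed.
    """
    flagged = {}
    for field in fields:
        value = mcp_response.get(field)
        if not value:
            continue
        host = urlparse(value).hostname
        if host not in ALLOWED_HOSTS:
            flagged[field] = value
    return flagged
```

An egress policy might block the same destination later. Checking here, at the handoff, means the agent never adopts the suggested callback or registry as part of its plan in the first place.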

How Sunglasses catches it

Sunglasses is a local-first input filter and runtime-trust layer for AI-agent workflows. It is not a deployment platform, cloud sandbox, package firewall, SSO system, CI/CD gate, or AI gateway.

It helps in the part of the workflow where agent-readable content enters context and can influence action. Sunglasses checks untrusted strings from files, docs, package metadata, MCP responses, tool outputs, issue comments, generated configuration, callbacks, and command results before the agent treats them as safe instructions or evidence.

This release adds nine new discovery-file-poisoning patterns (GLS-DFP-114, GLS-DFP-116, GLS-DFP-117, GLS-DFP-119, GLS-DFP-121, GLS-DFP-122, GLS-DFP-124, GLS-DFP-126, and GLS-DFP-127) that target the boundary where AI-built app workflows start believing discovery-surface content. They cover four surfaces a coding agent encounters while building, testing, and wiring up an app: CI and test-runner output artifacts (ctest, Go test2json, Jest result files); Web3 wallet-signing, auth-challenge, and payment-request flows; task-queue and kanban worker metadata; and JSON Schema annotations. They sit alongside already-mapped families: tool-output authority bypass, API descriptor poisoning, CI/CD metadata poisoning, agent instruction file poisoning, and provenance-chain fracture. The point is not to rebrand those as a new attack class. It is to place the existing pattern database at the exact boundary where generated apps and coding agents start believing context.

In practice, the house sentence is simple:

Sandboxes reduce blast radius after execution. Sunglasses filters untrusted context before it becomes authority.

Where this fits in the stack

Layer	What it answers	What it does not answer
Generated app/code	What the coding agent produced	Whether the inputs that shaped it were trustworthy
Deployment platform / sandbox	Where untrusted code can run and how isolated it is	Which repo files, MCP responses, or tool outputs should be believed
Secrets, RBAC, SSO, audit logs	Who can access what and what happened	Whether a specific content-driven action is safe now
Egress/network policy	Where traffic is allowed to go	Whether the callback or destination came from trusted context
Sunglasses input filter / runtime trust	Whether untrusted agent-readable content should shape the next action	Cloud hosting, sandbox orchestration, or identity management

For broader foundations, see AI Agent Security 101, the Python prompt-injection detection library, the Sunglasses manual, and the pattern database. For the CVP methodology behind our published findings, see CVP, and for common questions about scope, see the FAQ.
