Cloudflare Ships Zero-Trust Sandbox Auth While Mythos Shows Why That Matters
Cloudflare's new credential injection architecture for sandboxes lands the same week Anthropic's exploit-hunting AI reminds us why untrusted code and secrets don't mix.
Published April 13, 2026
Cloudflare just shipped zero-trust credential injection for Containers and Sandboxes, and the timing is chef's kiss given what else landed this week. The new architecture lets platforms run untrusted workloads—user code, agentic apps, whatever—without ever handing secrets to the thing inside the box. Secrets stay out of the guest environment entirely; outbound requests get TLS-intercepted and policy-checked by the host. You can allowlist domains, inject credentials dynamically per instance, and know exactly what's trying to leave.
This matters because the default model for sandboxes has been "trust the code enough to give it a token, hope it doesn't leak." That worked fine when sandboxes ran your own build steps or lightweight test suites. It breaks the instant you let a customer upload a Python script or spin up an AI coding agent that might hallucinate a curl to some random endpoint.
Mythos and the credential exposure problem
Anthropic's Mythos AI just found thousands of zero-days across major OSes and browsers by autonomously exploiting vulnerabilities. The tool is now being distributed to about 50 orgs via a Pentagon program called Project Glasswing, and UK regulators are scrambling to figure out the blast radius. The capabilities were significant enough that Anthropic chose not to release them publicly.
The relevant detail here: AI agent credentials live in the same box as untrusted code. If your sandbox architecture gives the agent an API key to call Stripe or AWS, and that agent gets compromised or hallucinates a side channel, the blast radius is everything the key can touch. VentureBeat reported that Anthropic and NVIDIA have now shipped "the first zero-trust AI agent architectures," solving credential exposure in opposite ways. Cloudflare's model fits the same pattern: the secret never enters the sandbox, so there's no way for the workload to exfiltrate it.
How the new Cloudflare outbound model works
Cloudflare's sandbox auth announcement walks through the architecture. Outbound Workers intercept all egress traffic from the sandbox. TLS connections are terminated at the host, so the host sees cleartext and can inject headers, rewrite requests, or deny them outright based on policy. Credentials are injected at egress time, after the sandbox issues a request, rather than provisioned into the guest up front, so the guest code never sees the token in environment variables or config files.
You define per-instance policies: "this user's sandbox can talk to Stripe and Twilio, but not AWS." If the sandbox tries something off-list, the request dies at the egress layer. The sandbox itself doesn't need to know about the allowlist or the credential; it just makes a normal HTTP call and the host handles the rest.
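To make that flow concrete, here's a minimal sketch in the shape of a Workers for Platforms outbound Worker. The `EgressPolicy` type, the `INSTANCE_POLICY` binding, and the exact injection behavior are my assumptions for illustration, not Cloudflare's actual API.

```typescript
// Hypothetical egress policy shape; the real Cloudflare config differs.
type EgressPolicy = {
  allowedHosts: Set<string>;
  // hostname -> Authorization header value, held only by the host
  credentials: Record<string, string>;
};

// Pure policy step: deny off-list hosts, inject the credential for
// allowed ones. Returns null when the request should be blocked.
function applyEgressPolicy(
  request: Request,
  policy: EgressPolicy,
): Request | null {
  const url = new URL(request.url);
  if (!policy.allowedHosts.has(url.hostname)) return null;
  const headers = new Headers(request.headers);
  const secret = policy.credentials[url.hostname];
  if (secret !== undefined) headers.set("Authorization", secret);
  return new Request(request, { headers });
}

// Outbound-Worker-style handler: every fetch leaving the sandbox
// passes through here after TLS termination at the host.
export default {
  async fetch(
    request: Request,
    env: { INSTANCE_POLICY: EgressPolicy },
  ): Promise<Response> {
    const rewritten = applyEgressPolicy(request, env.INSTANCE_POLICY);
    if (rewritten === null) {
      return new Response("egress denied by policy", { status: 403 });
    }
    return fetch(rewritten); // forward upstream with the injected credential
  },
};
```

The useful property is that the deny path and the injection path live in the same host-side function, so there is no configuration where the guest gets the token but skips the allowlist check.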
This is particularly useful for platforms running AI coding agents. Cloudflare explicitly calls out use cases like "agents kicked off from chat messages, Kanban updates, vibe coding UIs, terminal sessions, GitHub comments." Those workflows all reduce to the same shape: a user submits a prompt or some code, and the platform spins up a sandbox to run it. The old model required handing the agent a token upfront. The new model means the agent can call authenticated APIs without ever holding the key.
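From the guest's side, the call is just plain HTTP with no secret anywhere. A sketch of what sandboxed agent code might look like under this model (the Stripe endpoint is illustrative; the host egress layer supplies the real Authorization header):

```typescript
// Inside the sandbox: build a normal API request with no token.
// The Authorization header is added later, host-side, at egress,
// so nothing in this process's memory or env can leak a credential.
function buildChargeRequest(): Request {
  return new Request("https://api.stripe.com/v1/charges", {
    method: "POST",
    body: new URLSearchParams({ amount: "500", currency: "usd" }),
  });
}
```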
Containers go GA with SSH and performance upgrades
Separately, Cloudflare Containers hit general availability this week. Containers let you run heavier workloads on Workers—full Linux environments, CLI tools, resource-intensive apps—stuff that was awkward or impossible in the original isolate model. The GA release includes SSH support, better stability, and performance improvements since the beta. The timing lines up: Containers give you a place to run bigger agentic workloads, and the new auth model gives you a way to do it without leaking credentials.
The feature set now covers allowlists, deny lists, TLS interception, and dynamic per-instance egress policies. If you're building a platform where users ship arbitrary code or LLM-generated scripts, this is the scaffolding you need to avoid turning your sandbox into a credential buffet.
Why this matters for builders
Most platforms that run user code still rely on one of two bad patterns: (1) give the sandbox a scoped-down token and hope it stays scoped, or (2) proxy every API call through your own backend and rate-limit aggressively. The first leaks secrets into untrusted memory; the second adds latency and operational complexity.
Cloudflare's model splits the difference: the sandbox makes normal HTTP calls, the host decides what leaves and injects credentials on the fly, and the guest never sees the token. It's not a novel idea—service mesh sidecars and VPC egress proxies do similar things—but bundling it into the Workers runtime makes it accessible to smaller teams who aren't running Kubernetes everywhere.
If you're shipping anything that runs user code or agentic workloads, you now have a reference architecture for keeping secrets out of the blast radius. The fact that it shipped the same week Mythos reminded everyone what AI can do with root access feels less like coincidence and more like the industry finally catching up to the threat model.