Claude Code's sandbox was bypassable for 130 versions

Anthropic's Claude Code, the command-line AI coding agent, shipped a network sandbox that did not contain anything for 130 straight versions.

From sandbox launch on October 20, 2025 through a silent fix in v2.1.90 on April 1, 2026, an attacker who tricked Claude Code into running malicious code could siphon out credentials and source code, even on machines configured to block exactly that.

Researcher Aonan Guan found it. His second complete bypass of the sandbox in five months.

Both fixes shipped without a CVE on Claude Code itself. No advisory. No mention in the release notes.

What the sandbox is supposed to do

You list the domains the agent is allowed to reach (say *.github.com), and the sandbox blocks everything else. If a hidden instruction in a README tricks the agent into running attacker code, the fence is what stops AWS keys from leaving the building.

The first bypass

A user wanting to block all outbound traffic could set allowedDomains: [], an empty list.

The code only turned the filter on if the list had items in it. An empty list meant the filter never turned on, so everything was allowed.

The most restrictive setting became the most permissive. CVE-2025-66479, silently patched in v2.0.55.

The second bypass, with an analogy first

An attacker shows up with an ID reading EvilCorp [invisible stop sign] Google Corp. The guard sees "ends with Google Corp" and waves them in. The building's computer scans the same ID, treats the invisible mark as the end of the name, and logs EvilCorp as the person who entered.

Same ID, two readers, two answers.

Claude Code routes outbound traffic through a SOCKS5 proxy that checks hostnames with a JavaScript endsWith() call. An attacker connects with attacker-host.com\x00.google.com.

JavaScript treats \x00 as just another UTF-16 code unit, so the filter sees the trailing .google.com and approves. The OS hands the same bytes to libc's getaddrinfo(), which terminates at the null byte and resolves the blocked host.

A textbook parser-differential bug at the seam between two runtimes. The fix in sandbox-runtime 0.0.43 added an isValidHost() wrapper rejecting null bytes, percent signs, and CRLF before the matcher runs.

What an attacker walks away with

The bypass weaponizes through prompt injection. A hidden instruction in a GitHub issue or README gets the agent to run attacker code, which can then ship out anything in reach:

AWS credentials in ~/.aws/, GitHub tokens in ~/.config/gh/
Environment variables and model API keys
Cloud instance metadata at 169.254.169.254
Internal APIs and corporate intranet endpoints

All routed through the proxy meant to block it, on a raw SOCKS5 channel that standard HTTP egress logs do not see.

What teams should do

If you ran Claude Code with a wildcard allowlist on a credential-bearing machine between October 20, 2025 and your upgrade to v2.1.90, treat that window as a potential breach. Audit egress logs for SOCKS traffic and rotate every credential the agent could reach.

Two outside reports, two complete bypasses. A vendor sandbox is one layer of defense, never the entire boundary.

The real boundary lives outside the agent's reach: a disposable VM, a container with no IAM role, a firewall the agent cannot rewrite, audit logs that capture every protocol the host speaks.

Pin egress at the network layer, keep long-lived keys off laptops, and never assume the sandbox is the wall.