The WASM Sandbox Revolution: Security Layer for AI Agents
#technology #AI #security #opensource #wasm
[!abstract] Summary
WebAssembly is rapidly emerging as the default security primitive for AI agent tool execution. At least five independent efforts (Extism/mcp.run, Microsoft Wassette, Cloudflare Dynamic Workers, browser-based Pyodide, and academic two-way sandboxing) are arriving at the same conclusion: WASM’s capability-based security model is the right abstraction for running untrusted agent-generated code. This isn’t a coordinated effort; it’s convergent evolution driven by the same structural pressure: containers are too slow, regex filtering is too weak, and agents need to execute arbitrary code safely at scale.
The Problem: Agents Need to Run Untrusted Code
The fundamental security challenge of 2026 isn’t prompt injection (though that remains unsolved) — it’s code execution. Agents increasingly operate by generating and running code rather than making discrete tool calls. Cloudflare’s “Code Mode” demonstrated that converting MCP tool calls into TypeScript API calls cuts token usage by 81%. The implication is clear: the future of agentic computation is code generation, not tool invocation.
But if agents write code, something has to run it. And that something needs to be:
- Isolated: no access to the host filesystem, network, or other processes without explicit grants
- Fast: millisecond startup, not the hundreds of milliseconds that containers and microVMs need
- Lightweight: megabytes, not hundreds of megabytes
- Polyglot: agents write in whatever language the LLM knows best
- Composable: tools can be mixed, matched, and dynamically loaded
Containers satisfy isolation but fail on speed and weight. Browser sandboxes satisfy isolation and speed but are limited to the browser. Regex filtering and restricted runtimes are trivially bypassed: as NVIDIA’s team documented, a pattern filter cannot catch dangerous modules such as Python’s subprocess when generated code reaches them through a dependency’s internals.
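To make the regex weakness concrete, here is a minimal sketch. The denylist pattern and both payloads are invented for illustration; they are not taken from NVIDIA's write-up. The filter blocks the obvious import but passes code that never spells the module name out.

```python
import re

# Hypothetical denylist of the kind a regex-based filter might use.
DENYLIST = re.compile(r"\bimport\s+(os|subprocess)\b|\bos\.system\b|\bsubprocess\.")


def naive_filter_allows(code: str) -> bool:
    """Return True if the regex denylist finds nothing objectionable."""
    return DENYLIST.search(code) is None


direct = "import subprocess\nsubprocess.run(['id'])"

# Same module, but its name never appears as one token in the source text.
obfuscated = (
    "import importlib\n"
    "mod = importlib.import_module('sub' + 'process')\n"
    "runner = getattr(mod, 'r' + 'un')\n"
)

print(naive_filter_allows(direct))      # False: the obvious form is caught
print(naive_filter_allows(obfuscated))  # True: the filter sees nothing wrong
```

Tightening the pattern only moves the problem, because the filter inspects text while the interpreter resolves names at run time. That gap is what a sandbox boundary closes.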
WASM satisfies all five requirements. That’s why everyone is converging on it simultaneously.
The Five Vectors of Convergence
1. Extism + mcp.run: The Plugin Layer
Extism, built by Dylibso, is the universal plugin framework that makes WASM practical for application developers. It wraps low-level runtimes (primarily Wasmtime) with a consistent API for data passing, host functions, HTTP access control, and persistent variables. Host-side SDKs exist in 15+ languages (Rust, Go, Python, JS, Ruby, PHP, .NET, Haskell, Zig, etc.), and PDKs for writing plugins exist in 10+ languages.
mcp.run is Dylibso’s production application of Extism for AI tooling. It’s a registry and control plane for “servlets” — WASM binaries that expose MCP-compatible tools. The mcpx server loads servlets dynamically, supports profile-based tool switching (“switch to my marketing profile”), and runs tools either locally via stdio or remotely on Dylibso’s infrastructure via SSE.
Key architectural decisions:
- Servlets are Extism plugins compiled to WASM — any language that compiles to WASM can produce them
- Host-controlled HTTP — the servlet can’t make arbitrary network requests; the host explicitly allowlists domains
- Config injection — secrets like API keys are passed at runtime, not compiled into the binary
- Chicory SDK — a pure-Java Extism runtime that compiles WASM to Dalvik bytecode for Android, enabling on-device MCP tools (March 2026 announcement)
The Android angle is significant: mcp.run servlets can now run natively on phones via Chicory, bringing sandboxed AI tools to mobile without any native code compilation step.
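The host-controlled HTTP and config-injection decisions listed above can be sketched in a few lines. This is not the Extism API: `HostPolicy`, `host_http_get`, and the example domain are hypothetical names, and the sketch only models the decision the host makes on the plugin's behalf.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch
from urllib.parse import urlsplit


@dataclass
class HostPolicy:
    """Hypothetical host-side policy; not the Extism API."""
    allowed_hosts: list[str] = field(default_factory=list)  # empty list denies everything
    config: dict[str, str] = field(default_factory=dict)    # secrets injected at load time

    def may_fetch(self, url: str) -> bool:
        host = urlsplit(url).hostname or ""
        return any(fnmatch(host, pattern) for pattern in self.allowed_hosts)


def host_http_get(policy: HostPolicy, url: str) -> str:
    """Stand-in for a host function the plugin calls instead of opening sockets itself."""
    if not policy.may_fetch(url):
        raise PermissionError(f"host {urlsplit(url).hostname!r} is not allowlisted")
    # A real host would perform the request here; the sketch only reports the decision.
    return f"GET {url} permitted"


policy = HostPolicy(
    allowed_hosts=["api.example.com"],
    config={"API_KEY": "supplied-at-runtime"},  # never compiled into the plugin binary
)
print(host_http_get(policy, "https://api.example.com/v1/forecast"))
```

The point of the design is that the plugin has no socket of its own to misuse. Every request passes through a function the host owns, so the allowlist is enforced outside the code being constrained.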
2. Microsoft Wassette: The Enterprise Bridge
Wassette, from Microsoft’s Azure Core Upstream team, takes a different approach to the same problem. Instead of building a registry, it’s a standalone MCP server that loads WASM Components from OCI registries (the same container registries used for Docker images) and exposes their typed interfaces as MCP tools.
Key design choices:
- Built on Wasmtime with the Component Model — not raw WASM modules but typed, composable components with WIT interfaces
- Deny-by-default permissions — every system resource access requires explicit user approval (network domains, filesystem paths, etc.)
- OCI distribution — components stored in ghcr.io, Azure Container Registry, etc., with cryptographic signing via Notation/Cosign
- Zero runtime dependencies — single Rust binary, no Node.js, no Python, no Docker
- Works with Claude Code, GitHub Copilot, Cursor, Gemini CLI
The permission model is interactive: when a component tries to access api.weatherapi.com, Wassette prompts the user to grant access. This is the browser permission model applied to CLI-based agent tools. It’s crude but fundamentally correct — capability grants should be explicit and granular.
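A deny-by-default broker with interactive grants might look roughly like the following. `PermissionBroker` and `scripted_user` are hypothetical names, and this is a sketch of the pattern rather than Wassette's implementation; the scripted callback stands in for the prompt a user would answer.

```python
from typing import Callable


class PermissionBroker:
    """Hypothetical deny-by-default broker; not Wassette's implementation."""

    def __init__(self, ask: Callable[[str, str], bool]) -> None:
        self._ask = ask  # e.g. a CLI prompt shown to the user
        self._decisions: dict[tuple[str, str], bool] = {}

    def check(self, kind: str, target: str) -> bool:
        key = (kind, target)
        if key not in self._decisions:  # first use: nothing is granted implicitly
            self._decisions[key] = self._ask(kind, target)
        return self._decisions[key]


prompts: list[tuple[str, str]] = []


def scripted_user(kind: str, target: str) -> bool:
    """Stands in for an interactive prompt; approves exactly one network domain."""
    prompts.append((kind, target))
    return (kind, target) == ("network", "api.weatherapi.com")


broker = PermissionBroker(scripted_user)
print(broker.check("network", "api.weatherapi.com"))  # True
print(broker.check("network", "api.weatherapi.com"))  # True, answered from the stored decision
print(broker.check("storage", "/etc/passwd"))         # False
print(len(prompts))                                   # 2
```

Keying decisions on the pair of resource kind and target keeps grants granular: approving one domain says nothing about any other domain or any filesystem path.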
Wassette’s vision is autonomous tool discovery: agents should be able to search OCI registries, find components that solve their problem, load them, and use them — all within the sandbox. They’re not there yet, but the architecture supports it.
3. Cloudflare Dynamic Workers: The Scale Solution
Cloudflare’s Dynamic Workers (open beta, March 2026) is the most radical approach. Rather than sandboxing WASM specifically, they use V8 isolates — the same JavaScript engine sandbox used by Chrome — to run agent-generated code.
Numbers that matter:
- Startup: ~1ms (vs. hundreds of ms for containers)
- Memory: a few MB (vs. hundreds of MB for containers)
- Cost: $0.002/unique worker/day plus standard CPU charges
- Scale: millions of concurrent sandboxes, no global limits
- Latency: zero network hop — runs on the same thread as the calling Worker
The killer insight from Kenton Varda’s team: agents don’t care about programming language preferences. Humans argue about Python vs Rust vs Go. LLMs write whatever you ask them to. And JavaScript is the language designed for sandboxed execution.
Dynamic Workers use Workers RPC for capability passing — you create TypeScript API stubs and inject them into the sandbox. The agent code can only access what you explicitly provide. No filesystem, no arbitrary network, no global state leakage.
This is containerless sandboxing at hyperscaler scale. The trade-off: it’s JavaScript-only for practical purposes (Python/WASM is supported but slower to load). For agent code generation, that’s fine. For running arbitrary pre-compiled tools, Extism or Wassette is better.
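The shape of capability passing can be illustrated without Workers at all. The sketch below uses hypothetical names (`TicketStore`, `make_read_only_stub`, `agent_code`) and plain Python, which provides no isolation boundary: a closure can be inspected, whereas in Dynamic Workers the isolate is what enforces the restriction. Only the API shape carries over.

```python
class TicketStore:
    """Full-power object that stays on the host side (hypothetical)."""

    def __init__(self) -> None:
        self._tickets = {1: "printer offline", 2: "reset password"}

    def list_open(self) -> list[str]:
        return list(self._tickets.values())

    def delete_all(self) -> None:
        self._tickets.clear()


def make_read_only_stub(store: TicketStore):
    """Return a narrow capability: a function that can list tickets and nothing else."""
    def list_open() -> list[str]:
        return store.list_open()
    return list_open


def agent_code(list_open) -> str:
    """Stands in for generated code; its only input is the capability it was handed."""
    return f"{len(list_open())} open tickets"


store = TicketStore()
print(agent_code(make_read_only_stub(store)))  # 2 open tickets
```

The host decides what to hand over before the generated code runs, so the destructive `delete_all` operation is simply absent from the interface the agent sees.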
4. NVIDIA/Pyodide: The Browser Approach
NVIDIA’s team documented the simplest WASM sandboxing pattern: compile CPython to WASM via Pyodide and run LLM-generated Python in the browser. The browser’s existing sandbox provides OS and user isolation for free.
The security model is straightforward:
- LLM generates Python code (e.g., Plotly visualization)
- Server returns HTML with Pyodide runtime + the code
- Browser executes Python-in-WASM in its sandbox
- Malicious code either fails (missing modules) or is contained (no filesystem/network access)
This shifts execution from server to client, eliminating server-side eval() risk entirely. The limitations are real (large Pyodide runtime download, limited library support, no server-side data access), but for visualization and data processing tasks, it’s elegant.
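The server side of that flow can be sketched as a function that builds the page and never executes the generated code itself. `loadPyodide` and `runPythonAsync` are part of Pyodide's JavaScript API; the CDN URL and pinned version, the page layout, and the `build_page` helper are assumptions made for illustration.

```python
import json

# Assumed CDN location and version; pin whichever Pyodide release you have vetted.
PYODIDE_JS = "https://cdn.jsdelivr.net/pyodide/v0.26.4/full/pyodide.js"

PAGE_TEMPLATE = """<!doctype html>
<html>
  <head><script src="{pyodide_js}"></script></head>
  <body>
    <script>
      const generatedCode = {code_literal};
      async function main() {{
        const pyodide = await loadPyodide();
        await pyodide.runPythonAsync(generatedCode);
      }}
      main();
    </script>
  </body>
</html>
"""


def build_page(generated_code: str) -> str:
    """Embed LLM-generated Python as a JS string literal; the server never runs it."""
    literal = json.dumps(generated_code)
    # Escape "<" so the payload cannot close the <script> element early.
    literal = literal.replace("<", "\\u003c")
    return PAGE_TEMPLATE.format(pyodide_js=PYODIDE_JS, code_literal=literal)


page = build_page("print('hello from the browser')")
print("loadPyodide" in page)  # True
```

Escaping the payload matters even though the code runs in a sandbox: without it, generated text containing a closing script tag could inject arbitrary HTML into the page that hosts the sandbox.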
5. MVVM: The Academic Frontier
The MVVM paper (UC Santa Cruz + ShanghaiTech, 2025) pushes furthest conceptually with two-way sandboxing: WASM protects the host from the agent and hardware enclaves (Intel SGX, ARM TrustZone) protect the agent from the host. This matters for privacy-sensitive workloads where you don’t trust the cloud operator.
Key innovations:
- Cross-platform migration via WASM+WASI — agent workspaces move between ARM phones, RISC-V MCUs, and x86 servers seamlessly
- Speculative execution — 8.9× latency reduction by executing speculatively while validating in parallel
- Privacy-aware scheduling — automatically determines local vs. cloud execution based on data sensitivity
- 200ms failover between edge and cloud
This is the research frontier. Production systems today focus on one-way sandboxing (host from agent). Two-way sandboxing anticipates a future where agents carry sensitive state across trust boundaries.
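The speculative execution idea can be shown generically: start the work and its validation together, and commit the result only if validation passes. The sketch below is not MVVM's mechanism, which the summary above does not detail; `run_speculatively` is a hypothetical helper built on the standard library's thread pool.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def run_speculatively(
    execute: Callable[[], str],
    validate: Callable[[], bool],
) -> Optional[str]:
    """Start execution and validation together; commit the result only if validation passes."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        result_future = pool.submit(execute)
        verdict_future = pool.submit(validate)
        if verdict_future.result():
            return result_future.result()
        # Validation failed: the speculative result is discarded, never committed.
        return None


print(run_speculatively(lambda: "tool output", lambda: True))   # tool output
print(run_speculatively(lambda: "tool output", lambda: False))  # None
```

When validation passes, latency is roughly the slower of the two steps rather than their sum. The catch is that speculative work must have no externally visible side effects until it is committed, which is one reason a sandbox is a natural place to run it.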
Helm 4: The Infrastructure Validation
The adoption of Extism by Helm 4 (November 2025) for its plugin system is the strongest signal that WASM sandboxing has crossed from experimental to infrastructure-grade. Kubernetes’ most widely used package manager now runs plugins in WASM sandboxes by default.
Helm 4’s WASM plugins are:
- Typed and structured — no more arbitrary shell scripts
- Cross-platform — same plugin binary runs on Linux, macOS, Windows, ARM, x86
- Distributable via OCI — same registries as container images
- Sandboxed via Extism — plugins can’t escape their capability grants
Combined with Server-Side Apply, kstatus, and reproducible builds, Helm 4 represents the Kubernetes ecosystem’s embrace of WASM as a first-class execution primitive.
The Isolation Spectrum
| Approach | Startup | Memory | Isolation | Language Support | Best For |
|---|---|---|---|---|---|
| Regex/Restricted Python | 0ms | 0MB | Weak ❌ | Python only | Nothing (don’t do this) |
| Cloudflare Dynamic Workers | ~1ms | ~2MB | Strong ✅ | JS (Python/WASM possible) | Scale, code generation |
| Extism/mcp.run | ~5ms | ~5MB | Strong ✅ | Any → WASM | Plugin ecosystems, MCP tools |
| Wassette | ~5ms | ~5MB | Strong ✅ | Any → WASM Component | Enterprise, OCI-native |
| Pyodide (browser) | ~2s first load | ~50MB | Strong ✅ | Python | Client-side viz/data |
| Containers | ~200ms | ~200MB | Strong ✅ | Any | Full OS simulation |
| MicroVMs (Firecracker) | ~125ms | ~128MB | Very Strong ✅ | Any | Full isolation + kernel |
| MVVM (enclave + WASM) | ~200ms | Variable | Bidirectional ✅ | Any → WASM | Privacy-sensitive, migratable |
My Opinion
WASM sandboxing for AI agents is the most important infrastructure trend that nobody is talking about enough. The convergence is striking — Dylibso, Microsoft, Cloudflare, NVIDIA, and academic researchers all arriving at WASM independently, within months of each other, for the same use case.
What I think matters most:
- Extism is winning the plugin layer. Helm 4 adoption is the proof point. When the most conservative infrastructure project in Kubernetes chooses your framework, you’ve crossed the credibility threshold. mcp.run’s Chicory-on-Android play extends this to mobile, which is where most agents will eventually run.
- Cloudflare’s Dynamic Workers are the scale answer. If you need a million concurrent sandboxes at millisecond startup, no one else is close. The JavaScript-only constraint is a feature, not a bug, for agent code generation. This is the infrastructure that makes consumer-scale autonomous agents viable.
- The Component Model is the dark horse. Wassette’s bet on typed WASM Components (not raw modules) is architecturally superior to Extism’s current approach. WIT interfaces give you language-neutral type safety across the sandbox boundary. If the Component Model ecosystem matures fast enough, it could subsume Extism’s niche.
- Two-way sandboxing is premature but directionally correct. MVVM’s protect-the-agent-from-the-host model anticipates a world where agents carry confidential state across untrusted infrastructure. We’re not there yet, but sovereign AI agents will need this.
- The missing piece is attestation. All these sandboxes assume you trust the runtime. OCI signing (Cosign, Notation) helps verify tool provenance, but we don’t yet have a standard way for an agent to cryptographically verify that it’s running inside a genuine sandbox. MVVM’s enclave approach addresses this but at enormous complexity cost.
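The provenance half of that problem is the tractable one today. As a minimal sketch, a host can refuse to load a module whose bytes do not match a pinned OCI-style digest. `matches_pinned_digest` is a hypothetical helper; it checks content integrity only, does not replace Cosign or Notation signature verification, and says nothing about whether the runtime itself is genuine.

```python
import hashlib
import hmac


def matches_pinned_digest(wasm_bytes: bytes, pinned: str) -> bool:
    """Check a module against an OCI-style 'sha256:<hex>' digest before loading it."""
    algorithm, _, expected_hex = pinned.partition(":")
    if algorithm != "sha256" or not expected_hex:
        return False  # unknown or malformed digest: refuse
    actual_hex = hashlib.sha256(wasm_bytes).hexdigest()
    return hmac.compare_digest(actual_hex, expected_hex.lower())


module = b"\x00asm\x01\x00\x00\x00"  # the 8-byte header of an empty WASM module
pin = "sha256:" + hashlib.sha256(module).hexdigest()

print(matches_pinned_digest(module, pin))            # True
print(matches_pinned_digest(module + b"\x00", pin))  # False
```

A digest pin proves the bytes are the ones that were published. Attestation, meaning proof of where and how those bytes are executing, remains the open gap.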
The fundamental insight connecting all five approaches: capability-based security is the correct model for agent tool execution. Start with zero access, grant capabilities explicitly, enforce them at the sandbox boundary. This is the opposite of how most software works (start with everything, try to restrict). WASM makes the correct model the default model.
“The Agentic Web Stack - From Two Protocols to Six” noted that the security surface compounds with each protocol layer. WASM sandboxing is the structural answer: it doesn’t fix prompt injection, but it contains the blast radius. A poisoned MCP tool running inside an Extism sandbox with no network access and no filesystem access can’t exfiltrate anything, regardless of what the LLM was tricked into requesting.
That’s not a complete security model. But it’s the best foundation we have.
Sources
- Extism — The framework for building with WebAssembly
- mcp.run — Universal Tools for AI
- Wassette — WebAssembly-based tools for AI agents (Microsoft)
- Cloudflare Dynamic Workers — Sandboxing AI agents, 100x faster
- NVIDIA — Sandboxing Agentic AI Workflows with WebAssembly
- MVVM: Deploy Your AI Agents Securely, Efficiently, Everywhere (arxiv)
- Helm v4 — Paradigm Convergence and Plugin System Rebuild
- WebAssembly Ecosystem 2026 (Reintech)