SandboxVM: AI Chat Code Execution Backend Demo
Overview
SandboxVM is built for one job: safely executing AI-generated work inside real microVMs. Today that means more than raw code execution. The project now exposes template-bound environments for algorithm problems, Python data analysis with image artifacts, and a dedicated networking template for web search and page fetching.
The SDK is now template-first. Instead of asking the client to guess a runtime from a loose language string, you choose a concrete environment such as cpp-algorithm, go-algorithm, python-data-analysis, or websearch-tools. The server decides whether that template should come from a snapshot pool, a cold boot, or a long-lived singleton VM.
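The server-side decision described above can be pictured as a small dispatch function. This is a minimal sketch, not the project's actual implementation: the `Provisioning` enum, the two policy sets, and `choose_provisioning` are hypothetical names, and the fall-back to a cold boot when no warm VM is available is an assumption.

```python
from enum import Enum


class Provisioning(Enum):
    SNAPSHOT_POOL = "snapshot_pool"
    COLD_BOOT = "cold_boot"
    SINGLETON = "singleton"


# Hypothetical policy tables; the real server may keep this in its template registry.
SINGLETON_TEMPLATES = {"websearch-tools"}
POOLED_TEMPLATES = {"cpp-algorithm", "go-algorithm", "python-data-analysis"}


def choose_provisioning(template: str, warm_vms_available: int) -> Provisioning:
    """Pick how a VM for `template` is obtained (illustrative policy only)."""
    if template in SINGLETON_TEMPLATES:
        # One long-lived VM is shared by every request for this template.
        return Provisioning.SINGLETON
    if template in POOLED_TEMPLATES:
        # Assumption: fall back to a cold boot when the warm pool is empty.
        if warm_vms_available > 0:
            return Provisioning.SNAPSHOT_POOL
        return Provisioning.COLD_BOOT
    raise ValueError(f"unknown template: {template!r}")
```

The client only ever names the template; whether the request lands on a restored snapshot, a fresh boot, or the singleton stays a server concern.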
Architecture
┌────────────────────────────────────────────────────────────────────┐
│  Your Application (AI agent / backend / workflow runner)           │
│                                                                    │
│  from sandboxvm import Sandbox                                     │
│  result = Sandbox.run_once(code, template="python-data-analysis")  │
└───────────────────────────────┬────────────────────────────────────┘
                                │ HTTP / REST
                                ▼
┌────────────────────────────────────────────────────────────────────┐
│  SandboxVM Server (FastAPI)                                        │
│                                                                    │
│  ┌───────────────┐  ┌────────────────┐  ┌──────────────────────┐   │
│  │ Template      │  │ Snapshot Cache │  │ VM Manager           │   │
│  │ Registry      │  │ (/dev/shm)     │  │ lifecycle / metrics  │   │
│  └──────┬────────┘  └────────┬───────┘  └──────────┬───────────┘   │
│         │                    │                     │               │
│  ┌──────▼────────┐   ┌───────▼────────┐   ┌────────▼──────────┐    │
│  │ VM Pool       │   │ Template Build │   │ Singleton Manager │    │
│  │ snapshot VMs  │   │ OCI -> ext4    │   │ websearch-tools   │    │
│  └───────────────┘   └────────────────┘   └───────────────────┘    │
└──────────────────────────────────────┬─────────────────────────────┘
                                       │ vsock / TAP networking
                                       ▼
┌────────────────────────────────────────────────────────────────────┐
│  Firecracker MicroVMs                                              │
│                                                                    │
│  cpp-algorithm / go-algorithm / python-data-analysis               │
│  websearch-tools (network-enabled, singleton, proxy-aware)         │
│                                                                    │
│  guest agent -> exec(code) -> stdout / stderr / artifacts / files  │
└────────────────────────────────────────────────────────────────────┘
Key Features
⚡ Snapshot Startup
Pooled templates restore from pre-warmed snapshots stored in /dev/shm. The server continuously refills a separate pool for each template, so C++, Go, and Python data-analysis sandboxes are usually ready before the request arrives.
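A rough sketch of the refill idea, under stated assumptions: `WarmPool` and its methods are hypothetical names, and the `restore` callable stands in for whatever the server does to restore a microVM from a snapshot. The real server presumably runs the refill in the background; here it is a plain method so the behavior is easy to follow.

```python
from collections import deque
from typing import Callable, Deque, Generic, Optional, TypeVar

VM = TypeVar("VM")


class WarmPool(Generic[VM]):
    """Keeps up to `target_size` ready VMs for one template (illustrative only)."""

    def __init__(self, restore: Callable[[], VM], target_size: int) -> None:
        self._restore = restore      # hypothetical: restores one VM from a snapshot
        self._target = target_size
        self._ready: Deque[VM] = deque()

    def refill(self) -> int:
        """Top the pool back up to its target size; return how many VMs were added."""
        added = 0
        while len(self._ready) < self._target:
            self._ready.append(self._restore())
            added += 1
        return added

    def acquire(self) -> Optional[VM]:
        """Hand out the oldest warm VM, or None if the pool is currently empty."""
        return self._ready.popleft() if self._ready else None

    def __len__(self) -> int:
        return len(self._ready)
```

A caller that receives `None` from `acquire()` would need another path, such as a cold boot, which is why a steady refill matters for latency.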
🧩 Template-First Runtime
Each template defines its own OCI image source, rootfs artifact, CPU, memory, output mode, networking policy, and warmup strategy. The SDK now targets named templates rather than relying on a generic language=... contract.
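To make the list of per-template settings concrete, here is one way such a definition could be modeled. Everything in this sketch is an assumption: the `TemplateSpec` field names, the `REGISTRY` dict, the `get_template` helper, and every value (image references, paths, CPU and memory sizes) are placeholders, not the project's real schema or configuration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class TemplateSpec:
    name: str
    oci_image: str         # OCI image the rootfs is built from
    rootfs_path: str       # ext4 rootfs artifact produced by the template build
    vcpus: int
    memory_mib: int
    output_mode: str       # placeholder values: "stdout" or "stdout+artifacts"
    network_enabled: bool  # networking is opt-in per template
    warmup: str            # placeholder values: "snapshot-pool" or "singleton"


# Placeholder entries for illustration; none of these values come from the project.
REGISTRY: Dict[str, TemplateSpec] = {
    spec.name: spec
    for spec in (
        TemplateSpec(
            name="python-data-analysis",
            oci_image="registry.example.invalid/python-data-analysis:latest",
            rootfs_path="/path/to/rootfs/python-data-analysis.ext4",
            vcpus=2,
            memory_mib=1024,
            output_mode="stdout+artifacts",
            network_enabled=False,
            warmup="snapshot-pool",
        ),
        TemplateSpec(
            name="websearch-tools",
            oci_image="registry.example.invalid/websearch-tools:latest",
            rootfs_path="/path/to/rootfs/websearch-tools.ext4",
            vcpus=1,
            memory_mib=512,
            output_mode="stdout",
            network_enabled=True,
            warmup="singleton",
        ),
    )
}


def get_template(name: str) -> TemplateSpec:
    """Look up a template by name, failing loudly on anything unregistered."""
    try:
        return REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown template: {name!r}") from None
```

Rejecting unknown names outright, rather than guessing a runtime, mirrors the move away from a loose language string.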
🔒 True Isolation
Every run still executes inside a real Firecracker microVM with its own Linux kernel. Host filesystems are not shared into the guest, writable paths stay inside the VM, and networking is opt-in per template instead of globally enabled.
🌐 Web Search Template
websearch-tools is a dedicated network-enabled template backed by its own rootfs and singleton lifecycle. It supports web search, URL fetching, host-proxy bridging, and search-oriented tooling without weakening the offline execution templates.
📁 Files and Artifacts
The SDK exposes both file APIs and artifact APIs. Data-analysis templates write images into /tmp/outputs, and callers read them back with list_artifacts() and read_artifact().
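As a hedged sketch of the read-back step, the helper below copies every artifact to a local directory. Only the method names `list_artifacts()` and `read_artifact()` come from the text above. That the first returns an iterable of names and the second returns `bytes` is an assumption, and `download_artifacts` itself is a hypothetical helper, not part of the SDK.

```python
from pathlib import Path
from typing import List


def download_artifacts(sandbox, dest_dir: str) -> List[Path]:
    """Save every artifact from `sandbox` under `dest_dir`; return the written paths.

    Assumes list_artifacts() yields artifact names (str) and
    read_artifact(name) returns the artifact content as bytes.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved: List[Path] = []
    for name in sandbox.list_artifacts():
        # Names originate inside the guest, so keep only the final path
        # component to stop a crafted name from escaping dest_dir.
        safe_name = Path(name).name
        if safe_name in ("", ".."):
            continue
        target = dest / safe_name
        target.write_bytes(sandbox.read_artifact(name))
        saved.append(target)
    return saved
```

Treating artifact names as untrusted input fits the isolation model: the guest runs AI-generated code, so anything it produces deserves the same caution on the host side.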
📊 Metrics and Stateful Sessions
Results include startup, compile, run, and host-side timing metrics. Within a persistent sandbox session, state still survives across multiple run() calls, so an agent can build work incrementally instead of restarting from scratch every step.
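The incremental pattern can be sketched as a small driver that feeds snippets to one persistent session in order. `run_steps` is a hypothetical helper. It assumes only that the session object has a `run(code)` method returning a result with `.output` and `.metrics`, as `run_once` does in the Quick Start. How a persistent session is created is not covered here.

```python
from typing import Any, Iterable, List


def run_steps(session: Any, steps: Iterable[str]) -> List[Any]:
    """Run code snippets in order on one persistent session.

    Because the session keeps its state between run() calls, a later
    snippet can use variables or files created by an earlier one.
    """
    results: List[Any] = []
    for code in steps:
        # Assumption: run() returns an object exposing .output and .metrics.
        results.append(session.run(code))
    return results
```

For example, passing the steps `"x = 21"` and `"print(x * 2)"` relies on the second call seeing the `x` defined by the first, which only works because both run in the same session.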
Quick Start
# Install the SDK
pip install --index-url https://test.pypi.org/simple/ \
    --extra-index-url https://pypi.org/simple \
    sandboxvm==0.1.1b3

# Run AI-generated code safely
from sandboxvm import Sandbox

ai_generated_code = """
import math
primes = [x for x in range(2, 100) if all(x % i != 0 for i in range(2, x))]
print(f"Found {len(primes)} primes: {primes[:5]}...")
"""

result = Sandbox.run_once(
    ai_generated_code,
    template="python-data-analysis",
    base_url="http://127.0.0.1:8000"
)

print(result.output)   # Found 25 primes: [2, 3, 5, 7, 11]...
print(result.metrics)  # startup / compile / run timings