SandboxVM SDK - Project Overview

~100ms: typical snapshot restore

4: built-in templates

Pool + 1: pooled VMs plus a singleton websearch VM

SandboxVM: AI Chat Code Execution Backend Demo

Overview

SandboxVM is built for one job: safely executing AI-generated work inside real microVMs. That now covers more than raw code execution: the project exposes template-bound environments for algorithm problems, Python data analysis with image artifacts, and a dedicated networking template for web search and page fetching.

The SDK is now template-first. Instead of asking the client to guess a runtime from a loose language string, you choose a concrete environment such as cpp-algorithm, go-algorithm, python-data-analysis, or websearch-tools. The server decides whether that template should come from a snapshot pool, a cold boot, or a long-lived singleton VM.
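The routing decision described above can be pictured as a small lookup. This is only a sketch, not the server's actual code: the `Provisioning` enum, the `TEMPLATE_PROVISIONING` table, and the `choose_provisioning` function are hypothetical names, and the cold-boot fallback for an empty pool is an assumption about how the three modes might relate.

```python
from enum import Enum


class Provisioning(Enum):
    SNAPSHOT_POOL = "snapshot_pool"
    COLD_BOOT = "cold_boot"
    SINGLETON = "singleton"


# Hypothetical routing table; the real server's mapping is not shown in this overview.
TEMPLATE_PROVISIONING = {
    "cpp-algorithm": Provisioning.SNAPSHOT_POOL,
    "go-algorithm": Provisioning.SNAPSHOT_POOL,
    "python-data-analysis": Provisioning.SNAPSHOT_POOL,
    "websearch-tools": Provisioning.SINGLETON,
}


def choose_provisioning(template: str, pool_ready: bool = True) -> Provisioning:
    """Pick how a VM for `template` is obtained; unknown templates are rejected."""
    try:
        strategy = TEMPLATE_PROVISIONING[template]
    except KeyError:
        raise ValueError(f"unknown template: {template!r}") from None
    # Assumed fallback: an empty pool degrades to a cold boot instead of failing.
    if strategy is Provisioning.SNAPSHOT_POOL and not pool_ready:
        return Provisioning.COLD_BOOT
    return strategy
```

The point of the design is that the client only ever names a template; how the VM is obtained stays a server-side concern.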

Architecture

┌────────────────────────────────────────────────────────────────────┐
│  Your Application  (AI agent / backend / workflow runner)         │
│                                                                    │
│   from sandboxvm import Sandbox                                    │
│   result = Sandbox.run_once(code, template="python-data-analysis") │
└───────────────────────────────┬────────────────────────────────────┘
                                │ HTTP / REST
                                ▼
┌────────────────────────────────────────────────────────────────────┐
│  SandboxVM Server  (FastAPI)                                       │
│                                                                    │
│  ┌───────────────┐  ┌────────────────┐  ┌──────────────────────┐  │
│  │ Template      │  │ Snapshot Cache │  │ VM Manager           │  │
│  │ Registry      │  │ (/dev/shm)     │  │ lifecycle / metrics  │  │
│  └──────┬────────┘  └────────┬───────┘  └──────────┬───────────┘  │
│         │                    │                     │              │
│  ┌──────▼────────┐   ┌───────▼────────┐   ┌────────▼──────────┐   │
│  │ VM Pool       │   │ Template Build │   │ Singleton Manager │   │
│  │ snapshot VMs  │   │ OCI -> ext4    │   │ websearch-tools   │   │
│  └───────────────┘   └────────────────┘   └───────────────────┘   │
└──────────────────────────────────────┬─────────────────────────────┘
                                       │ vsock / TAP networking
                                       ▼
┌────────────────────────────────────────────────────────────────────┐
│  Firecracker MicroVMs                                              │
│                                                                    │
│  cpp-algorithm / go-algorithm / python-data-analysis               │
│  websearch-tools (network-enabled, singleton, proxy-aware)         │
│                                                                    │
│  guest agent -> exec(code) -> stdout / stderr / artifacts / files  │
└────────────────────────────────────────────────────────────────────┘

Key Features

⚡ Snapshot Startup

Pooled templates restore from pre-warmed snapshots stored in /dev/shm. The server continuously refills template-specific pools so C++, Go, and Python data sandboxes are usually ready before the request arrives.
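A toy model of the refill behaviour, assuming a fixed target size per template. The `TemplatePool` class and its methods are hypothetical, VM handles are plain strings, and the snapshot restore is a placeholder; the real server restores Firecracker snapshots and presumably refills in the background.

```python
from collections import deque
from itertools import count


class TemplatePool:
    """Toy per-template pool of ready-to-use VM handles (plain strings here)."""

    def __init__(self, template: str, target_size: int):
        self.template = template
        self.target_size = target_size
        self._ready = deque()
        self._ids = count(1)

    def _restore_from_snapshot(self) -> str:
        # Placeholder for the real snapshot restore; returns a fake VM id.
        return f"{self.template}-vm-{next(self._ids)}"

    def refill(self) -> int:
        """Top the pool back up to target_size; returns how many VMs were added."""
        added = 0
        while len(self._ready) < self.target_size:
            self._ready.append(self._restore_from_snapshot())
            added += 1
        return added

    def acquire(self) -> str:
        """Hand out a warm VM if one is ready, otherwise restore on demand."""
        if self._ready:
            return self._ready.popleft()
        return self._restore_from_snapshot()
```

Calling `refill()` after each `acquire()` keeps a warm VM waiting, which is why a request usually pays only the hand-off cost and not the restore itself.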

🧩 Template-First Runtime

Each template defines its own OCI image source, rootfs artifact, CPU, memory, output mode, networking policy, and warmup strategy. The SDK now targets named templates rather than relying on a generic language=... contract.
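One way to picture such a template definition is a small record type. The `TemplateSpec` dataclass, its field names, and every value in `EXAMPLE` are illustrative assumptions, not the project's real schema or settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateSpec:
    """Hypothetical shape of one template entry."""

    name: str
    oci_image: str       # OCI image the rootfs is built from
    rootfs_path: str     # built ext4 rootfs artifact
    vcpus: int
    memory_mib: int
    output_mode: str     # assumed values: "text" or "artifacts"
    warmup: str          # assumed values: "snapshot_pool", "cold_boot", "singleton"
    network_enabled: bool = False  # networking is opt-in per template


def validate(spec: TemplateSpec) -> None:
    """Reject obviously unusable resource settings."""
    if spec.vcpus < 1:
        raise ValueError("vcpus must be at least 1")
    if spec.memory_mib < 1:
        raise ValueError("memory_mib must be at least 1")


# Placeholder values only; not the project's real configuration.
EXAMPLE = TemplateSpec(
    name="python-data-analysis",
    oci_image="registry.example/python-data:latest",
    rootfs_path="/tmp/python-data-analysis.ext4",
    vcpus=2,
    memory_mib=512,
    output_mode="artifacts",
    warmup="snapshot_pool",
)
```

Defaulting `network_enabled` to `False` mirrors the opt-in networking policy: a template is offline unless its definition says otherwise.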

🔒 True Isolation

Every run still executes inside a real Firecracker microVM with its own Linux kernel. Host filesystems are not shared into the guest, writable paths stay inside the VM, and networking is opt-in per template instead of globally enabled.

🌐 Web Search Template

websearch-tools is a dedicated network-enabled template backed by its own rootfs and singleton lifecycle. It supports web search, URL fetching, host-proxy bridging, and search-oriented tooling without weakening the offline execution templates.

📁 Files and Artifacts

The SDK exposes both file APIs and artifact APIs. Data-analysis templates can write images into /tmp/outputs, and callers can read them back with list_artifacts() and read_artifact().
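A sketch of how a caller might copy artifacts to local disk. It assumes that list_artifacts() returns artifact names and that read_artifact(name) returns raw bytes; neither signature is confirmed by this overview. `ArtifactSource`, `save_artifacts`, and `FakeSandbox` are hypothetical helpers, and `FakeSandbox` exists only so the example runs without a server.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class ArtifactSource(Protocol):
    """Assumed shape of the artifact API; the real signatures may differ."""

    def list_artifacts(self) -> list[str]: ...
    def read_artifact(self, name: str) -> bytes: ...


def save_artifacts(source: ArtifactSource, dest_dir: str) -> list[Path]:
    """Copy every artifact reported by `source` into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for name in source.list_artifacts():
        # Keep only the final path component so a guest-chosen name
        # cannot write outside dest_dir.
        target = dest / Path(name).name
        target.write_bytes(source.read_artifact(name))
        saved.append(target)
    return saved


class FakeSandbox:
    """In-memory stand-in so the sketch runs without a SandboxVM server."""

    def __init__(self, files: dict[str, bytes]):
        self._files = dict(files)

    def list_artifacts(self) -> list[str]:
        return sorted(self._files)

    def read_artifact(self, name: str) -> bytes:
        return self._files[name]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = save_artifacts(FakeSandbox({"plot.png": b"\x89PNG"}), tmp)
        print([p.name for p in paths])
```

Stripping directory components from artifact names is a defensive choice: the names come from code running in the guest, so the host should not trust them as paths.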

📊 Metrics and Stateful Sessions

Results include startup, compile, run, and host-side timing metrics. Within a persistent sandbox session, state survives across multiple run() calls, so an agent can build on earlier steps instead of restarting from scratch each time.
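The session semantics can be illustrated locally, without the SDK. `LocalSession` below is a hypothetical stand-in that shares one namespace across run() calls; it runs code in the current process with no isolation, so it only demonstrates the state-carrying behaviour and is not a substitute for a sandbox.

```python
import contextlib
import io


class LocalSession:
    """Local stand-in for a persistent sandbox: one namespace shared by every run()."""

    def __init__(self):
        self._namespace = {}

    def run(self, code: str) -> str:
        """Execute code against the shared namespace and return captured stdout."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self._namespace)  # no isolation here; illustration only
        return buffer.getvalue()


session = LocalSession()
session.run("total = 40")                        # step 1 defines state
output = session.run("total += 2\nprint(total)")  # step 2 builds on it
```

A fresh `LocalSession()` starts with an empty namespace, which corresponds to a one-shot run where nothing carries over.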

Quick Start

# Install the SDK (shell)
pip install --index-url https://test.pypi.org/simple/ \
  --extra-index-url https://pypi.org/simple \
  sandboxvm==0.1.1b3

# Run AI-generated code safely (Python)
from sandboxvm import Sandbox

ai_generated_code = """
primes = [x for x in range(2, 100) if all(x % i != 0 for i in range(2, x))]
print(f"Found {len(primes)} primes: {primes[:5]}...")
"""

result = Sandbox.run_once(
    ai_generated_code,
    template="python-data-analysis",
    base_url="http://127.0.0.1:8000"
)

print(result.output)       # Found 25 primes: [2, 3, 5, 7, 11]...
print(result.metrics)      # startup / compile / run timings

Technology Stack

Firecracker, Python, FastAPI, vsock, OCI Images, Linux Kernel, Template Registry, ext4 rootfs, VM Snapshots, TAP / NAT Networking