Your agent says what it needs. Cerver finds compute, runs it, keeps the session. If your machine is busy, Cerver routes to the next one. If it's 3am, the session waits. Intent and execution are decoupled — your agent never thinks about infrastructure.
const session = await fetch("/gateway/sessions", { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ task: "Boot a preview and run tests", requirements: { runtime: "node", timeout_minutes: 20 }, policy: { mode: "balanced" } }) }) await fetch(`/gateway/sessions/${sessionId}/run`, { method: "POST", headers: { "Content-Type": "application/json" }, body: JSON.stringify({ code: "npm test", timeout: 600 }) })
{
"session_id": "sess_123",
"provider": "vercel",
"compute_id": "cmp_123",
"stdout": "test suite passed",
"metrics": {
"provision_time_ms": 812,
"last_exec_latency_ms": 1432,
"cost_estimate_usd": 0.01
}
}If your Mac is busy running 3 agents, Cerver routes the next task to your Mac Mini. If that's busy too, it goes to Vercel. You set the priority — Cerver handles the routing.
Pause a session, close your laptop, come back tomorrow. The transcript, state, and context are preserved. Resume on the same machine or a different one.
An agent running a task can request more compute mid-work. Spawn a child session for parallel tests. The parent-child relationship is tracked.
Register your laptop, your server, or your team's machines. Add Vercel or E2B with your own credentials. Cerver routes across all of them.
Create session. Run code. Pause. Resume. Spawn. That's it. Your agent talks to one endpoint — never to providers directly.
API reference and docs designed for AI agents to read. An agent can learn the API, create sessions, and manage compute without human help.
Different computers wake up differently. Cerver keeps the session API stable and reports the startup profile back as metrics.
Add cloud providers or connect your own machines via Cerver Connect.
Describe what you need — runtime, persistence, timeout. Cerver checks your compute priority and picks the first available machine.
Run code. When done, pause — compute is freed, session is saved. Resume later on the same machine or a different one. Context follows the session, not the machine.
Need more compute? Spawn child sessions. They inherit the parent's context and run in parallel. Results flow back when done.
One API key. Any compute. Sessions that persist. Agents that scale.