Cerver is the AI plumbing you'd otherwise write Friday at midnight — sessions, sandboxes, secrets, streaming, prompt caching. Skip the form. Run the tutorial in your agent and watch it sign you up, take your Vercel key if you want, run a real session in front of you, and tell you what it cost.
curl -fsSL https://cerver.ai/skill | bash
Four phases. About two minutes start to finish. You watch.
1. Hits POST /v2/auth/temp-signup. You get a 30-min temp key. Zero forms. (Sketch after this list.)
2. "Want to run on your Vercel? Paste a token. Or skip and use ours." That's it.
3. Says hi · swaps the compute mid-session · runs the same prompt on Claude and GPT-5 side-by-side.
4. Token counts and a dollar estimate. Offers cerver upgrade if you want a permanent key.
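Under the hood, phase 1 is a single unauthenticated call. A minimal sketch, assuming the API lives at api.cerver.ai and returns JSON with key and expires_at fields (base URL and field names are guesses; only the endpoint path comes from the step above):

// Hypothetical base URL and response shape; the path is from phase 1.
const res = await fetch("https://api.cerver.ai/v2/auth/temp-signup", {
  method: "POST",
});
const { key, expires_at } = await res.json(); // assumed field names
// Use the 30-minute temp key as a bearer token on later calls.
console.log(`temp key expires at ${expires_at}`, key);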
Between an intent and its execution, every layer is swappable except the one that remembers.
One bill. One log. One API. Zoom out — same stack, same Cerver.
Three shapes of pain we hear most often. If one sounds like you, the demo curl up top is one paste away.
Persistent sessions per user. Streaming. Prompt caching that actually caches. Per-user usage on every turn so billing reconciles with one query. Stop maintaining a messages table.
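A hedged sketch of what turn-level usage might look like, reusing the run-llm route shown further down this page; the user tag and the usage response shape are assumptions, not documented fields:

// Sketch only: the route appears elsewhere on this page; the
// `user` field and the `usage` shape are assumptions.
const sessionId = "sess_123"; // placeholder id
const res = await fetch(`/v2/sessions/${sessionId}/run-llm`, {
  method: "POST",
  body: JSON.stringify({
    model: "claude-sonnet-4-5",
    input: "summarize this PR",
    user: "user_42", // hypothetical per-user billing tag
  }),
});
const { usage } = await res.json(); // per-turn tokens + cost, assumed shape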
Sandbox is a tool the agent calls. It requests, runs, releases — alone. Fan out across eight boxes mid-conversation without writing a worker pool. Same session before and after.
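What the eight-box fan-out could look like from the caller's side, as a sketch; the /sandbox/run route and its body are hypothetical, since the text only promises request/run/release with no worker pool:

// Hypothetical route and body: the page promises fan-out without
// a worker pool but doesn't document the sandbox endpoint itself.
const sessionId = "sess_123"; // placeholder id
const results = await Promise.all(
  Array.from({ length: 8 }, (_, i) =>
    fetch(`/v2/sessions/${sessionId}/sandbox/run`, {
      method: "POST",
      body: JSON.stringify({ cmd: `pytest shard_${i}` }),
    }).then(r => r.json())
  )
);
// Same session before and after the burst.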
Want to know if GPT-5 beats Sonnet on your real workload? Or if Vercel's cold start clears E2B's? Run the same input through Cerver and compare cost, latency, and output side-by-side.
A user closes the tab while a tool call is running. The next message Anthropic sees has an orphan tool_use, and every subsequent turn 400s until you reconcile the transcript. Here's what the fix actually looks like.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// `messages` (declared with let) and `tools` come from your stored transcript.
try {
  await client.messages.create({
    model: "claude-sonnet-4-5",
    max_tokens: 1024, // required by the Messages API
    messages,
    tools,
  });
} catch (e) {
  const msg = e?.error?.error?.message ?? "";
  if (!msg.includes("ids were found without")) throw e;

  // Parse the orphan ids out of Anthropic's error string
  const ids = [...msg.matchAll(/toolu_[A-Za-z0-9_]+/g)]
    .map(m => m[0]);

  // Inject one synthetic tool_result per orphan
  const synthetic = ids.map(id => ({
    role: "user",
    content: [{
      type: "tool_result",
      tool_use_id: id,
      content: "aborted",
      is_error: true,
    }],
  }));

  // Retry against a now-valid transcript
  messages = [...messages, ...synthetic];
  await client.messages.create({
    model: "claude-sonnet-4-5",
    max_tokens: 1024,
    messages,
    tools,
  });
}
// You don't write this code. At all.
// Cerver detects the orphan on its side,
// flushes synthetic tool_results, and your
// next /run-llm call just works.
await fetch(`/v2/sessions/${id}/run-llm`, {
  method: "POST",
  body: JSON.stringify({ input: "continue" }),
});
Same mechanism for sandbox lifecycle, prompt caching, transcript persistence, and per-session billing — all of which you'd otherwise also write yourself.
A few categories near us. Most of these solve a different problem — included so you can rule us out fast if we're not the fit.
A/B Cloudflare against Vercel against your laptop. Same transcript, same agent state — only the runtime changes.
// Mid-session: switch the compute under it
POST /v2/sessions/:id/compute
{ "compute": { "provider": "vercel" } }

// Later: swap to Cloudflare. Transcript untouched.
POST /v2/sessions/:id/compute
{ "compute": { "provider": "cloudflare" } }
Send the same input to Claude, GPT-5, or Gemini without rewriting your prompt or your client. Compare the answers in your dashboard.
// Same session. Try Claude.
POST /v2/sessions/:id/run-llm
{ "model": "claude-sonnet-4-5", "input": "summarize this PR" }

// Same input. Try GPT-5.
POST /v2/sessions/:id/run-llm
{ "model": "gpt-5", "input": "summarize this PR" }
A session is one create + any number of runs + close. Roughly: ~30 turns of chat, or one agent run that spawns and closes a sandbox. You see the count in your dashboard.
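As a sketch, one billable session in code; POST /v2/sessions and the DELETE close are assumed route names patterned on the run-llm route above:

// Assumed create/close routes; only run-llm appears verbatim on this page.
const { id } = await fetch("/v2/sessions", { method: "POST" })
  .then(r => r.json());
for (const input of ["outline the fix", "apply it"]) {
  await fetch(`/v2/sessions/${id}/run-llm`, {
    method: "POST",
    body: JSON.stringify({ model: "claude-sonnet-4-5", input }),
  });
}
await fetch(`/v2/sessions/${id}`, { method: "DELETE" }); // close ends the billable unit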
The first 200 sessions are on us. After that, top up $49+ in credits and draw down at a half-cent minimum per session, slightly more if your session uses a bigger model. Bring your own compute keys (Vercel, Cloudflare, E2B) and your own model keys (Claude, GPT, Gemini). You always see the itemised bill.
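For scale: at the half-cent minimum, a $49 top-up covers up to 9,800 sessions ($49 ÷ $0.005); sessions on bigger models draw down faster.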