Cerver is the AI plumbing you'd otherwise write Friday at midnight — sessions, sandboxes, secrets, streaming, prompt caching. Skip the form. Run the tutorial in your agent and watch it sign you up, take your Vercel key if you want, run a real session in front of you, and tell you what it cost.
curl -fsSL https://cerver.ai/skill | bash
Four phases. About two minutes start to finish. You watch.
1. Hits POST /v2/auth/temp-signup. You get a 30-min temp key. Zero forms. (Sketch after this list.)
2. "Want to run on your Vercel? Paste a token. Or skip and use ours." That's it.
3. Says hi · swaps the compute mid-session · runs the same prompt on Claude and GPT-5 side-by-side.
4. Token counts and a dollar estimate. Offers cerver upgrade if you want a permanent key.
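Under the hood, phase 1 is a single unauthenticated call. A minimal sketch, assuming the API lives at api.cerver.ai and returns JSON with key and expires_at fields (base URL and field names are guesses; only the endpoint path comes from the step above):

// Hypothetical base URL and response shape; the path is from phase 1.
const res = await fetch("https://api.cerver.ai/v2/auth/temp-signup", {
  method: "POST",
});
const { key, expires_at } = await res.json(); // assumed field names
// Use the 30-minute temp key as a bearer token on later calls.
console.log(`temp key expires at ${expires_at}`, key);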
Between an intent and its execution, every layer is swappable except the one that remembers.
One bill. One log. One API. Zoom out — same stack, same Cerver.
Three shapes of pain we hear most often. If one sounds like you, the demo curl up top is one paste away.
Persistent sessions per user. Streaming. Prompt caching that actually caches. Per-user usage on every turn so billing reconciles with one query. Stop maintaining a messages table.
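A hedged sketch of what turn-level usage might look like, reusing the run-llm route shown further down this page; the user tag and the usage response shape are assumptions, not documented fields:

// Sketch only: the route appears elsewhere on this page; the
// `user` field and the `usage` shape are assumptions.
const sessionId = "sess_123"; // placeholder id
const res = await fetch(`/v2/sessions/${sessionId}/run-llm`, {
  method: "POST",
  body: JSON.stringify({
    model: "claude-sonnet-4-5",
    input: "summarize this PR",
    user: "user_42", // hypothetical per-user billing tag
  }),
});
const { usage } = await res.json(); // per-turn tokens + cost, assumed shape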
Sandbox is a tool the agent calls. It requests, runs, releases — alone. Fan out across eight boxes mid-conversation without writing a worker pool. Same session before and after.
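What the eight-box fan-out could look like from the caller's side, as a sketch; the /sandbox/run route and its body are hypothetical, since the text only promises request/run/release with no worker pool:

// Hypothetical route and body: the page promises fan-out without
// a worker pool but doesn't document the sandbox endpoint itself.
const sessionId = "sess_123"; // placeholder id
const results = await Promise.all(
  Array.from({ length: 8 }, (_, i) =>
    fetch(`/v2/sessions/${sessionId}/sandbox/run`, {
      method: "POST",
      body: JSON.stringify({ cmd: `pytest shard_${i}` }),
    }).then(r => r.json())
  )
);
// Same session before and after the burst.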
Want to know if GPT-5 beats Sonnet on your real workload? Or if Vercel's cold start clears E2B's? Run the same input through Cerver and compare cost, latency, and output side-by-side.
A user closes the tab while a tool call is running. The next message Anthropic sees has an orphan tool_use, and every subsequent turn 400s until you reconcile the transcript. Here's what the fix actually looks like.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// `messages` (declared with let) and `tools` come from your stored transcript.
try {
  await client.messages.create({
    model: "claude-sonnet-4-5",
    max_tokens: 1024, // required by the Messages API
    messages,
    tools,
  });
} catch (e) {
  const msg = e?.error?.error?.message ?? "";
  if (!msg.includes("ids were found without")) throw e;

  // Parse the orphan ids out of Anthropic's error string
  const ids = [...msg.matchAll(/toolu_[A-Za-z0-9_]+/g)]
    .map(m => m[0]);

  // Inject one synthetic tool_result per orphan
  const synthetic = ids.map(id => ({
    role: "user",
    content: [{
      type: "tool_result",
      tool_use_id: id,
      content: "aborted",
      is_error: true,
    }],
  }));

  // Retry against a now-valid transcript
  messages = [...messages, ...synthetic];
  await client.messages.create({
    model: "claude-sonnet-4-5",
    max_tokens: 1024,
    messages,
    tools,
  });
}
// You don't write this code. At all.
// Cerver detects the orphan on its side,
// flushes synthetic tool_results, and your
// next /run-llm call just works.
await fetch(`/v2/sessions/${id}/run-llm`, {
  method: "POST",
  body: JSON.stringify({ input: "continue" }),
});
Same mechanism for sandbox lifecycle, prompt caching, transcript persistence, and per-session billing — all of which you'd otherwise also write yourself.
A few categories near us. Most of these solve a different problem — included so you can rule us out fast if we're not the fit.
A/B Cloudflare against Vercel against your laptop. Same transcript, same agent state — only the runtime changes.
// Mid-session: switch the compute under it
POST /v2/sessions/:id/compute
{ "compute": { "provider": "vercel" } }

// Later: swap to Cloudflare. Transcript untouched.
POST /v2/sessions/:id/compute
{ "compute": { "provider": "cloudflare" } }
Send the same input to Claude, GPT-5, or Gemini without rewriting your prompt or your client. Compare the answers in your dashboard.
// Same session. Try Claude.
POST /v2/sessions/:id/run-llm
{ "model": "claude-sonnet-4-5", "input": "summarize this PR" }

// Same input. Try GPT-5.
POST /v2/sessions/:id/run-llm
{ "model": "gpt-5", "input": "summarize this PR" }
A session is one create + any number of runs + close. Roughly: ~30 turns of chat, or one agent run that spawns and closes a sandbox. You see the count in your dashboard.
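As a sketch, one billable session in code; POST /v2/sessions and the DELETE close are assumed route names patterned on the run-llm route above:

// Assumed create/close routes; only run-llm appears verbatim on this page.
const { id } = await fetch("/v2/sessions", { method: "POST" })
  .then(r => r.json());
for (const input of ["outline the fix", "apply it"]) {
  await fetch(`/v2/sessions/${id}/run-llm`, {
    method: "POST",
    body: JSON.stringify({ model: "claude-sonnet-4-5", input }),
  });
}
await fetch(`/v2/sessions/${id}`, { method: "DELETE" }); // close ends the billable unit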
The first 200 sessions are on us. After that, top up $49+ in credits and draw down at a half-cent minimum per session, slightly more if your session uses a bigger model. Bring your own compute keys (Vercel, Cloudflare, E2B) and your own model keys (Claude, GPT, Gemini). You always see the itemised bill.
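For scale: at the half-cent minimum, a $49 top-up covers up to 9,800 sessions ($49 ÷ $0.005); sessions on bigger models draw down faster.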