For platform engineering

Save your team a year of AI plumbing.

Every product team is shipping AI on its own session store, compute pool, and secret manager — work you'd normally have to consolidate later. Cerver is that consolidation, ready to use. Your team spends its quarters on policy, audit, and cost — not on a fifth implementation of the same primitives.

Four contracts. One API.

The duplication tax stops. Governance moves from review meetings to JSON schemas — and reliability is part of the contract, not an afterthought.

01 · portability

Swap compute mid-session

POST /v2/sessions/:id/compute with { compute: { compute_id } }. Same session_id, same transcript. Vendor migration is a config flip, not a refactor of every team's wrapper.
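
In curl, that swap is one call (session and compute ids are placeholders):

curl · swap compute
curl -X POST https://gateway.cerver.ai/v2/sessions/sess_.../compute \
  -H "Authorization: Bearer $TEAM_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "compute": { "compute_id": "comp_..." } }'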

02 · reliability

Reliable by contract

Auto-failover on /run and /run/stream when a compute drops. Idempotent retries on /input. Three normalized statuses (running · ready · ended) plus endReason on every closed session. Stale sessions auto-promote to ended, so dashboards aren't graveyards.
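
Assuming a per-session GET alongside the listing endpoint shown below, the contract is one call to verify (ids and the endReason value here are illustrative):

curl · session status (sketch)
curl https://gateway.cerver.ai/v2/sessions/sess_... \
  -H "Authorization: Bearer $TEAM_KEY"
# → { "session_id": "sess_...", "status": "ended", "endReason": "compute_lost" }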

03 · secrets

Secrets the LLM never sees

Provider keys live in your secret backend (Infisical default, pluggable). Cerver fetches via secret_fetch(name) server-side, attaches the key to the upstream call, and returns provider + usage, never the value. One audit trail across teams.
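
One way the binding can look at session creation (a sketch: the secrets field, its shape, and the secret name are assumptions, not documented API):

curl · secret binding (sketch)
curl -X POST https://gateway.cerver.ai/v2/sessions \
  -H "Authorization: Bearer $TEAM_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "session_name": "rerank-prod",
       "secrets": { "provider_key": "anthropic-prod" } }'
# "anthropic-prod" names a secret in your backend; Cerver resolves it server-side
# via secret_fetch() and the value never appears in any response or transcript.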

04 · accounting

Receipts on every turn

Every response carries usage { input_tokens, output_tokens, cache_hit, model, compute_provider }. Tag sessions with metadata.{ team, product, cost_center }. Your cost report is one query on the Cerver session export, not a migration of three logging stacks.
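
The block on each response looks like this (field names from the contract above; values illustrative):

json · usage block (illustrative values)
"usage": { "input_tokens": 1482, "output_tokens": 396, "cache_hit": true,
           "model": "sonnet", "compute_provider": "e2b" }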

One problem. Many runners. Keep the winner.

Most teams pile every task into one harness, then context-switch through ten unrelated half-fixes — slower output, more mistakes. Cerver flips it: one task at a time, several providers in parallel, the best output wins.

From

Ten tasks, one harness.

Context-switching across unrelated problems on the same model. No signal which one is the bottleneck. Mistakes pile up; throughput drops.

To

One task, three runners.

Same prompt, side-by-side outputs from Sonnet + Opus + Haiku, or Claude Code vs Codex CLI. Pick by quality, latency, and cost: receipts, not preferences.

Switch model mid-session · switch compute mid-session · spawn sibling sessions

POST /v2/sessions three times with the same task and a different harness each time. Compare the three transcripts and usage blocks via the export API. No benchmark stack to maintain: it's three rows in the Cerver session listing, sliced by the comparison_group metadata you set on each POST.
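
A sketch of the fan-out. The endpoint and the comparison_group tag come from the description above; the model field and the group name are assumptions:

curl · one task, three runners (sketch)
for model in sonnet opus haiku; do
  curl -X POST https://gateway.cerver.ai/v2/sessions \
    -H "Authorization: Bearer $TEAM_KEY" \
    -H "Content-Type: application/json" \
    -d "{ \"session_name\": \"rerank-$model\", \"model\": \"$model\",
         \"metadata\": { \"comparison_group\": \"rerank-01\" } }"
done

# Slice the listing by the group you set:
curl "https://gateway.cerver.ai/v2/sessions?comparison_group=rerank-01" \
  -H "Authorization: Bearer $TEAM_KEY"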

The plumbing you don't have to write.

Register a compute once. Every product team consumes it through the same auth, same metadata schema, same dashboards.

curl · platform setup
# 1. Register a compute against the org's E2B account.
curl -X POST https://gateway.cerver.ai/v2/computes \
  -H "Authorization: Bearer $CERVER_ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "provider": "e2b", "scope": "shared", "label": "shared-e2b" }'

# 2. A product team creates a session against it. Audit-friendly metadata.
curl -X POST https://gateway.cerver.ai/v2/sessions \
  -H "Authorization: Bearer $TEAM_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "session_name": "search-rerank-prod",
       "compute": { "compute_id": "comp_..." },
       "metadata": { "team": "search", "cost_center": "growth", "env": "prod" } }'

# 3. You query usage by team, by week, by anything in metadata.
curl "https://gateway.cerver.ai/v2/sessions?team=search&before=2026-05-01" \
  -H "Authorization: Bearer $CERVER_ADMIN_KEY"

Other ways teams use Cerver.