Apps use sessions. Providers supply compute.
That is the whole mental model. Your product should usually use the session API. Local bridges and remote providers should usually implement the compute interface.
Start with sessions, not vendors.
The normal Cerver flow is: ask what compute is available, create a logical session, run work through that session, then read metrics and close it. The session layer is the product surface.
curl -X GET https://your-cerver.example.com/gateway/providers
curl -X POST https://your-cerver.example.com/gateway/sessions \
-H "Content-Type: application/json" \
-d '{
"task": "Run a quick shell command",
"workload": "general",
"requirements": {
"runtime": "shell",
"timeout_minutes": 5
},
"policy": {
"mode": "pinned",
"pinned_provider": "vercel",
"allowed_providers": ["vercel"]
}
}'
curl -X POST https://your-cerver.example.com/gateway/sessions/SESSION_ID/run \
-H "Content-Type: application/json" \
-d '{
"code": "echo hello from cerver && node -v",
"timeout": 30
}'
curl -X GET https://your-cerver.example.com/gateway/sessions/SESSION_ID/metrics
Base URL: Use your deployed Cerver gateway URL.
The simple model.
- If you are building an app, start with the session layer.
- If you are adding a backend, implement the compute layer.
- A session binds app work to one chosen computer.
- requirements and policy tell Cerver how to choose that computer.
Sessions need a compute. Attach one first.
A new account has no compute attached, so POST /v2/sessions will return 409 with a recommendation report. Two ways to fix that:
Option A — Local relay (your laptop, mac mini, server)
One command, on the machine you want to use:
curl -fsSL https://kompany.dev/install-cerver.sh | bash
The script installs uv if missing, runs the relay, opens a browser to log you in, then registers the host as a private compute on your account. It self-updates by polling GitHub for new commits, so leave it running on an always-on machine and it stays current.
Option B — BYO cloud provider (Vercel, e2b)
Enable a provider with your own credentials in the dashboard at cerver.ai/dashboard#providers, or via API:
curl -X POST https://gateway.cerver.ai/v2/account/providers \
-H "Authorization: Bearer $CERVER_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"provider": "vercel", "credentials": {"vercel_token": "..."}}'
After enabling, sessions can target the provider via policy.allowed_providers or target_compute_id: "provider_vercel".
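For illustration, a create-session payload that targets the enabled provider directly might look like this (assuming `target_compute_id` sits at the top level of the payload, alongside the fields shown in the other examples in this doc):

```json
{
  "task": "Run a build on the BYO provider",
  "workload": "general",
  "requirements": { "runtime": "node", "timeout_minutes": 10 },
  "target_compute_id": "provider_vercel"
}
```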
Verify
curl https://gateway.cerver.ai/v2/computes \
-H "Authorization: Bearer $CERVER_API_TOKEN"
# should list at least one compute
Bring your own secrets store.
Cerver intentionally does not store user app-secrets (Buffer, Slack, OpenAI keys, etc.); those belong in a tool built for secrets. Use Infisical, 1Password, AWS Secrets Manager, your shell env, whatever fits.
The cerver-mcp package ships a secret_fetch(name) tool that gives the agent one uniform interface regardless of backend.
Default: env backend
export BUFFER_API_KEY=...
uvx cerver-mcp
# agent: secret_fetch("BUFFER_API_KEY") → reads from process env
Production: Infisical backend
{
"mcpServers": {
"cerver": {
"command": "uvx",
"args": ["cerver-mcp"],
"env": {
"CERVER_API_TOKEN": "ck_...",
"CERVER_SECRETS_BACKEND": "infisical",
"INFISICAL_TOKEN": "st...",
"INFISICAL_PROJECT_ID": "...",
"INFISICAL_ENVIRONMENT": "prod"
}
}
}
}
Audited, rotatable, never stored on the relay disk. Future backends (1Password, AWS, GCP) plug in the same way.
The app-facing session API.
These are the canonical endpoints for products that want one stable doorway into execution. The lower-level compute API still exists, but the session API is the intended integration path for apps.
Create a session, then let it bind to compute.
Cerver chooses the compute provider at session creation time. You can let it decide, prefer one backend, or pin the request to a single provider.
{
"task": "Boot a preview environment for a Next.js repo",
"workload": "preview",
"repo": {
"name": "branch-monkey",
"framework": "nextjs",
"languages": ["typescript"],
"signals": ["needs-preview", "short-lived"]
},
"requirements": {
"runtime": "node",
"package_install": true,
"public_preview": true,
"persistence_level": "medium",
"timeout_minutes": 20
},
"policy": {
"mode": "balanced",
"allowed_providers": ["vercel", "e2b"],
"max_startup_ms": 2000
},
"session_name": "preview-session"
}
{
"session_id": "sess_123",
"session_name": "preview-session",
"status": "ready",
"provider": "vercel",
"compute_id": "cmp_123",
"sandbox_id": "sbx_local_123",
"metrics": {
"provision_time_ms": 812,
"time_to_first_exec_ms": null,
"last_exec_latency_ms": null,
"average_exec_latency_ms": null,
"average_stream_open_latency_ms": null,
"total_exec_count": 0,
"total_stream_count": 0,
"interaction_count": 0,
"session_length_ms": 0,
"cost_estimate_usd": 0.01,
"uptime_percent": 99.3,
"predicted_startup_ms": 820,
"engagement_score": 0,
"engagement_label": "warming"
},
"routing": {
"recommended_provider": "vercel",
"confidence": "high",
"primary_reason": "Best fit for preview workloads",
"secondary_reasons": ["Startup within target", "Public preview supported"],
"fallback_order": ["e2b"],
"canary_run": false
}
}
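The routing block in the response above can drive client-side fallback. A minimal sketch, assuming the `Routing` shape mirrors the response exactly (the helper name is mine, not part of the API):

```typescript
// Shape taken from the session-create response's routing block.
interface Routing {
  recommended_provider: string;
  confidence: "low" | "medium" | "high";
  primary_reason: string;
  secondary_reasons: string[];
  fallback_order: string[];
  canary_run: boolean;
}

// Build the order in which a client could retry providers: the
// recommended provider first, then Cerver's fallback_order, with
// duplicates removed while preserving first-seen position.
function providerTryOrder(routing: Routing): string[] {
  return [...new Set([routing.recommended_provider, ...routing.fallback_order])];
}

console.log(providerTryOrder({
  recommended_provider: "vercel",
  confidence: "high",
  primary_reason: "Best fit for preview workloads",
  secondary_reasons: [],
  fallback_order: ["e2b"],
  canary_run: false,
}));
```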
Sessions are also transcripts.
Every Cerver session keeps the full turn-by-turn conversation on its transcript[] field: user messages, assistant replies, tool calls, tool results. That makes the same primitive a shared memory layer: any agent on the account can read what any other agent on the account did, just by listing sessions and reading transcripts. No vector DB, no separate retrieval service.
Read from plain HTTP
curl https://gateway.cerver.ai/v2/sessions?limit=20 \
-H "Authorization: Bearer $CERVER_API_TOKEN"
curl https://gateway.cerver.ai/v2/sessions/SESSION_ID \
-H "Authorization: Bearer $CERVER_API_TOKEN"
# returns the full session record including transcript[]
Read from an MCP-aware agent
Drop the API key into your agent's MCP config once. The cerver-mcp package surfaces three tools the agent can call directly: cerver_session_list, cerver_session_peek, and cerver_session_export.
{
"mcpServers": {
"cerver": {
"command": "uvx",
"args": ["cerver-mcp"],
"env": { "CERVER_API_TOKEN": "ck_..." }
}
}
}
This is the same data the dashboard at cerver.ai/dashboard#sessions shows you: humans and agents see identical content, scoped to whichever account owns the API token.
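As a sketch of consuming a transcript, here is a small helper that pulls the latest assistant reply out of a fetched session record. The exact shape of a transcript entry is not specified in this doc, so the `{ role, content }` fields below are an assumption:

```typescript
// Transcript entries as this doc describes them: user messages, assistant
// replies, tool calls, tool results. The field names are assumed.
interface TranscriptEntry {
  role: "user" | "assistant" | "tool_call" | "tool_result";
  content: string;
}

// Return the most recent assistant reply, or null if the session has none.
function lastAssistantReply(transcript: TranscriptEntry[]): string | null {
  for (let i = transcript.length - 1; i >= 0; i--) {
    if (transcript[i].role === "assistant") return transcript[i].content;
  }
  return null;
}
```

An app would GET /v2/sessions/SESSION_ID, read the record's transcript[] array, and pass it to a helper like this.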
Run code, stream output, then read visibility back.
Session execution responses stay provider-aware internally, but Cerver adds its own session-level metadata around them. Streaming responses include extra Cerver headers so your app can observe the gateway path directly.
- X-Cerver-Session-Id identifies the logical session.
- X-Cerver-Provider tells you which backend actually executed the run.
- X-Cerver-Stream-Latency-Ms measures how long it took to open the stream.
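A client can read these headers straight off the streaming response. A minimal sketch, using the header names listed above (the return shape and function name are mine):

```typescript
// Session-level metadata Cerver attaches to streaming responses.
interface CerverStreamMeta {
  sessionId: string | null;
  provider: string | null;
  streamLatencyMs: number | null;
}

// Collect the X-Cerver-* headers from a fetch Response's headers.
// Header lookup is case-insensitive per the Fetch standard.
function readCerverHeaders(headers: Headers): CerverStreamMeta {
  const latency = headers.get("X-Cerver-Stream-Latency-Ms");
  return {
    sessionId: headers.get("X-Cerver-Session-Id"),
    provider: headers.get("X-Cerver-Provider"),
    streamLatencyMs: latency === null ? null : Number(latency),
  };
}
```

After `const res = await fetch(runStreamUrl, ...)`, call `readCerverHeaders(res.headers)` before consuming the body.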
| Metric | Meaning |
|---|---|
| provision_time_ms | How long the initial compute provisioning took. |
| time_to_first_exec_ms | How long until the first execution happened after session creation. |
| last_exec_latency_ms | Latency of the latest non-stream execution. |
| average_stream_open_latency_ms | Average latency to begin stream delivery. |
| cost_estimate_usd | Cerver's estimated session spend so far. |
| engagement_label | One of idle, warming, engaged, or deep. |
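As one illustration of how an app might act on these metrics, here is a hedged session-recycling heuristic. The field names match the metrics above; the thresholds are mine, not Cerver policy:

```typescript
// Subset of the session metrics this heuristic looks at.
interface SessionMetrics {
  average_exec_latency_ms: number | null;
  cost_estimate_usd: number;
  engagement_label: "idle" | "warming" | "engaged" | "deep";
}

// Decide whether to keep routing work through an existing session or
// recycle it. Thresholds are illustrative only.
function shouldRecycle(m: SessionMetrics, budgetUsd: number): boolean {
  if (m.cost_estimate_usd >= budgetUsd) return true; // over budget
  if (m.engagement_label === "idle") return true;    // nobody is using it
  if (m.average_exec_latency_ms !== null && m.average_exec_latency_ms > 5000) {
    return true;                                     // persistently slow
  }
  return false;
}
```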
Ask Cerver for a comparison before a real run.
Stress tests are the comparison layer. They let Cerver score compute providers for a representative workload and return a structured report your app or agent can use before placing real traffic.
curl -X POST https://your-cerver.example.com/gateway/stress-tests \
-H "Content-Type: application/json" \
-d '{
"task": "Compare preview launch backends",
"kind": "preview_launch",
"workload": "preview",
"requirements": {
"runtime": "node",
"public_preview": true,
"package_install": true,
"timeout_minutes": 20
},
"providers": ["vercel", "e2b"],
"sample_size": 5
}'
Today these reports are still simulated from provider profiles and routing logic. The next step is live canary execution.
The lower-level compute API still exists.
If you want to work directly with raw compute instead of a logical session, the lower-level API remains available. The URLs still say /sandbox for compatibility, but this is the compute layer.
Current provider picture inside Cerver.
This is the honest state of the current codebase. Cerver can advise on more providers than it can execute today.
- Local compute adapter wired into Cerver. Once P69_BASE_URL points at a running local server, Cerver can treat your machine like another execution backend.
- Live adapter verified through create, run, and stop. Best current execution path.
- Execution path exists, but the current implementation is still container-backed and not the cleanest local path.
- Live compute adapter verified through create, run, and stop. Uses bring-your-own E2B credentials.
- Modeled in the advisor and catalog. Not yet wired as a live execution adapter.
If a service wants to appear in Cerver, it implements the compute interface.
A provider becomes runnable by implementing the shared compute contract. It becomes selectable by being registered in the provider registry and catalog. Apps do not use this directly; the session layer does.
export interface CerverInterface {
readonly providerName: "cloudflare" | "vercel" | "e2b" | "p69";
createSandbox(request, env): Promise<SandboxRecord>;
runSandbox(record, request, env): Promise<Response>;
runSandboxStream(record, request, env): Promise<Response>;
installPackage(record, request, env): Promise<Response>;
writeFile(record, request, env): Promise<Response>;
readFile(record, path, encoding, env): Promise<Response>;
getState(record, env): Promise<Response>;
setState(record, state, env): Promise<Response>;
deleteSandbox(record, env): Promise<Response>;
}
- Implement the compute contract to make the provider runnable.
- Register it in the provider registry to make it live inside Cerver.
- Add a provider profile to the gateway catalog so the router can score it.
- Then the session layer can recommend, pin, or fall back to it without being rewritten.
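As a sketch of the shape a new backend takes, here is a stub adapter covering three of the contract's methods; the rest follow the same pattern. The `SandboxRecord` fields and the canned responses are placeholders; a real adapter would forward each call to the provider's API:

```typescript
// Minimal placeholder record; the real SandboxRecord carries more fields.
interface SandboxRecord {
  sandboxId: string;
  provider: string;
}

// Stub adapter in the shape of the compute contract. Each method a real
// provider would proxy to its backend is faked with a local Response.
class P69Provider {
  readonly providerName = "p69" as const;

  async createSandbox(request: unknown, env: unknown): Promise<SandboxRecord> {
    // A real adapter would call the backend's create endpoint here.
    return { sandboxId: `sbx_${Date.now()}`, provider: this.providerName };
  }

  async runSandbox(record: SandboxRecord, request: unknown, env: unknown): Promise<Response> {
    // A real adapter would execute code remotely; this stub returns an
    // empty result tagged with the sandbox id.
    return new Response(JSON.stringify({ sandbox_id: record.sandboxId, stdout: "" }), {
      headers: { "Content-Type": "application/json" },
    });
  }

  async deleteSandbox(record: SandboxRecord, env: unknown): Promise<Response> {
    return new Response(null, { status: 204 });
  }
}
```

Once the full contract is implemented, registering the class in the provider registry and adding a catalog profile makes it routable without touching the session layer.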