Cerver
Docs

Apps use sessions. Providers supply compute.

That is the whole mental model. Your product should usually use the session API. Local bridges and remote providers should usually implement the compute interface.

Start with sessions, not vendors.

The normal Cerver flow is: ask what compute is available, create a logical session, run work through that session, then read metrics and close it. The session layer is the product surface.

curl -X GET https://your-cerver.example.com/gateway/providers

curl -X POST https://your-cerver.example.com/gateway/sessions \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Run a quick shell command",
    "workload": "general",
    "requirements": {
      "runtime": "shell",
      "timeout_minutes": 5
    },
    "policy": {
      "mode": "pinned",
      "pinned_provider": "vercel",
      "allowed_providers": ["vercel"]
    }
  }'

curl -X POST https://your-cerver.example.com/gateway/sessions/SESSION_ID/run \
  -H "Content-Type: application/json" \
  -d '{
    "code": "echo hello from cerver && node -v",
    "timeout": 30
  }'

curl -X GET https://your-cerver.example.com/gateway/sessions/SESSION_ID/metrics

Base URL: Use your deployed Cerver gateway URL.
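The curl flow above can be wrapped in a small typed client. A minimal sketch, assuming only the endpoints shown above; the `CerverClient` class and `FetchLike` type are illustrative, not a shipped SDK, and the fetch function is injected so the wrapper stays testable:

```typescript
// Illustrative wrapper over the session flow: list providers, create a
// session, run code, read metrics. Paths come from the curl examples above.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string }
) => Promise<{ json(): Promise<any> }>;

class CerverClient {
  constructor(private baseUrl: string, private fetchFn: FetchLike) {}

  private post(path: string, body: unknown): Promise<any> {
    return this.fetchFn(this.baseUrl + path, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    }).then((r) => r.json());
  }

  listProviders(): Promise<any> {
    return this.fetchFn(this.baseUrl + "/gateway/providers").then((r) => r.json());
  }
  createSession(request: unknown): Promise<any> {
    return this.post("/gateway/sessions", request);
  }
  run(sessionId: string, code: string, timeout = 30): Promise<any> {
    return this.post(`/gateway/sessions/${sessionId}/run`, { code, timeout });
  }
  metrics(sessionId: string): Promise<any> {
    return this.fetchFn(`${this.baseUrl}/gateway/sessions/${sessionId}/metrics`).then((r) => r.json());
  }
}
```

Because the fetch function is a constructor argument, the same wrapper works against a deployed gateway or a stub in tests.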

The simple model.

Session layer: The app-facing layer for sessions, input, runs, routing, policy, and metrics.
Compute layer: The provider-facing layer for actual computers: local machines, remote sandboxes, streams, and workspaces.
Requirements: What the work needs: runtime, package install, preview, browser, persistence, timeout.
Policy: How Cerver should choose: balanced, fastest, cheapest, resilient, or pinned.
  • If you are building an app, start with the session layer.
  • If you are adding a backend, implement the compute layer.
  • A session binds app work to one chosen computer.
  • requirements and policy tell Cerver how to choose that computer.
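The two fields that drive that choice can be sketched as TypeScript shapes. Field names are taken from the examples in this doc; treat the unions and optionality as assumptions, not an exhaustive schema:

```typescript
// Illustrative shapes for the session-create fields that drive routing.
interface Requirements {
  runtime?: "shell" | "node" | string; // runtimes seen in this doc's examples
  package_install?: boolean;
  public_preview?: boolean;
  persistence_level?: "low" | "medium" | "high"; // assumed levels around "medium"
  timeout_minutes?: number;
}

interface Policy {
  mode: "balanced" | "fastest" | "cheapest" | "resilient" | "pinned";
  pinned_provider?: string;       // required in practice when mode is "pinned"
  allowed_providers?: string[];
  max_startup_ms?: number;
}

// A pinned policy forces one provider; a balanced policy lets Cerver choose.
const pinned: Policy = { mode: "pinned", pinned_provider: "vercel", allowed_providers: ["vercel"] };
const balanced: Policy = { mode: "balanced", allowed_providers: ["vercel", "e2b"], max_startup_ms: 2000 };
```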

Sessions need a compute. Attach one first.

A new account has no compute attached, so POST /v2/sessions will return 409 with a recommendation report. Two ways to fix that:

Option A — Local relay (your laptop, mac mini, server)

One command, on the machine you want to use:

curl -fsSL https://kompany.dev/install-cerver.sh | bash

Installs uv if missing, runs the relay, opens a browser to log you in, then registers the host as a private compute on your account. Self-updates by polling GitHub for new commits — leave it running on an always-on machine and it stays current.

Option B — BYO cloud provider (Vercel, e2b)

Enable a provider with your own credentials in the dashboard at cerver.ai/dashboard#providers, or via API:

curl -X POST https://gateway.cerver.ai/v2/account/providers \
  -H "Authorization: Bearer $CERVER_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"provider": "vercel", "credentials": {"vercel_token": "..."}}'

After enabling, sessions can target the provider via policy.allowed_providers or target_compute_id: "provider_vercel".

Verify

curl https://gateway.cerver.ai/v2/computes \
  -H "Authorization: Bearer $CERVER_API_TOKEN"
# should list at least one compute
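Until a compute is attached, session creation fails. A minimal sketch of handling that, assuming the 409 body carries the recommendation report mentioned above (the exact report shape is not pinned down by this doc):

```typescript
// Sketch: distinguish a created session from the no-compute 409.
interface HttpResponse {
  status: number;
  json(): Promise<any>;
}

async function createSessionOrExplain(
  resp: HttpResponse
): Promise<{ session?: any; recommendation?: any }> {
  if (resp.status === 409) {
    // No compute attached yet: surface the report so the caller can
    // attach a relay or enable a provider, then retry.
    return { recommendation: await resp.json() };
  }
  return { session: await resp.json() };
}
```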

Bring your own secrets store.

Cerver intentionally does not store user app-secrets (Buffer, Slack, OpenAI keys, etc.) — those belong in a tool built for secrets. Use Infisical, 1Password, AWS Secrets Manager, your shell env, whatever fits.

The cerver-mcp package ships a secret_fetch(name) tool that gives the agent one uniform interface regardless of backend.

Default: env backend

export BUFFER_API_KEY=...
uvx cerver-mcp
# agent: secret_fetch("BUFFER_API_KEY") → reads from process env
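The backend-agnostic idea behind secret_fetch can be sketched as a function plus swappable backends. This is illustrative of the pattern only, not the cerver-mcp source; `makeEnvBackend` and `secretFetch` are hypothetical names, and the environment map is injected (e.g. `process.env`) to keep the sketch self-contained:

```typescript
// A backend is anything that maps a secret name to a value.
type SecretBackend = (name: string) => string | undefined;

// env backend over an injected environment map (pass process.env in practice)
function makeEnvBackend(env: Record<string, string | undefined>): SecretBackend {
  return (name) => env[name];
}

// One uniform fetch interface regardless of backend; throws on a miss so
// agents fail loudly instead of running with an empty credential.
function secretFetch(name: string, backend: SecretBackend): string {
  const value = backend(name);
  if (value === undefined) throw new Error(`secret not found: ${name}`);
  return value;
}
```

Swapping Infisical or another store in means replacing only the backend function; callers never change.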

Production: Infisical backend

{
  "mcpServers": {
    "cerver": {
      "command": "uvx",
      "args": ["cerver-mcp"],
      "env": {
        "CERVER_API_TOKEN": "ck_...",
        "CERVER_SECRETS_BACKEND": "infisical",
        "INFISICAL_TOKEN": "st...",
        "INFISICAL_PROJECT_ID": "...",
        "INFISICAL_ENVIRONMENT": "prod"
      }
    }
  }
}

Audited, rotatable, never stored on the relay disk. Future backends (1Password, AWS, GCP) plug in the same way.

The app-facing session API.

These are the canonical endpoints for products that want one stable doorway into execution. The lower-level compute API still exists, but the session API is the intended integration path for apps.

GET /gateway/providers
  List compute providers, capability, readiness, and integration status.
POST /gateway/recommend
  Score compute providers and return a recommendation report without creating a session.
POST /gateway/sessions
  Create a logical session and provision or bind the backing compute if needed.
GET /gateway/sessions/:id
  Read the current session record, transcript, routing decision, compute binding, and metrics snapshot.
POST /gateway/sessions/:id/input
  Append user, assistant, or system input to the session transcript.
POST /gateway/sessions/:id/run
  Run code through the session’s backing compute provider and return a normalized response.
POST /gateway/sessions/:id/run/stream
  Run with streaming output and include Cerver latency headers on the response.
GET /gateway/sessions/:id/metrics
  Read latency, engagement, uptime, startup, and estimated cost fields for the session.
DELETE /gateway/sessions/:id
  Stop the backing compute resource and terminate the logical session.

Create a session, then let it bind to compute.

Cerver chooses the compute provider at session creation time. You can let it decide, prefer one backend, or pin the request to a single provider.

Example create request:

{
  "task": "Boot a preview environment for a Next.js repo",
  "workload": "preview",
  "repo": {
    "name": "branch-monkey",
    "framework": "nextjs",
    "languages": ["typescript"],
    "signals": ["needs-preview", "short-lived"]
  },
  "requirements": {
    "runtime": "node",
    "package_install": true,
    "public_preview": true,
    "persistence_level": "medium",
    "timeout_minutes": 20
  },
  "policy": {
    "mode": "balanced",
    "allowed_providers": ["vercel", "e2b"],
    "max_startup_ms": 2000
  },
  "session_name": "preview-session"
}
The response includes the routing decision and a metrics snapshot:

{
  "session_id": "sess_123",
  "session_name": "preview-session",
  "status": "ready",
  "provider": "vercel",
  "compute_id": "cmp_123",
  "sandbox_id": "sbx_local_123",
  "metrics": {
    "provision_time_ms": 812,
    "time_to_first_exec_ms": null,
    "last_exec_latency_ms": null,
    "average_exec_latency_ms": null,
    "average_stream_open_latency_ms": null,
    "total_exec_count": 0,
    "total_stream_count": 0,
    "interaction_count": 0,
    "session_length_ms": 0,
    "cost_estimate_usd": 0.01,
    "uptime_percent": 99.3,
    "predicted_startup_ms": 820,
    "engagement_score": 0,
    "engagement_label": "warming"
  },
  "routing": {
    "recommended_provider": "vercel",
    "confidence": "high",
    "primary_reason": "Best fit for preview workloads",
    "secondary_reasons": ["Startup within target", "Public preview supported"],
    "fallback_order": ["e2b"],
    "canary_run": false
  }
}
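The routing block above is directly consumable by a client: try the recommended provider first, then walk fallback_order. A minimal sketch (the `providerAttemptOrder` helper is illustrative, not part of any SDK):

```typescript
// Build the order in which a client would attempt providers, from the
// routing block returned at session creation.
interface Routing {
  recommended_provider: string;
  fallback_order: string[];
}

function providerAttemptOrder(routing: Routing): string[] {
  return [routing.recommended_provider, ...routing.fallback_order];
}
```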

Sessions are also transcripts.

Every Cerver session keeps the full turn-by-turn conversation on its transcript[] field — user messages, assistant replies, tool calls, tool results. That makes the same primitive a shared memory layer: any agent on the account can read what any other agent on the account did, just by listing sessions and reading transcripts. No vector DB, no separate retrieval service.

Read from plain HTTP

curl https://gateway.cerver.ai/v2/sessions?limit=20 \
  -H "Authorization: Bearer $CERVER_API_TOKEN"

curl https://gateway.cerver.ai/v2/sessions/SESSION_ID \
  -H "Authorization: Bearer $CERVER_API_TOKEN"
# returns the full session record including transcript[]
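The shared-memory idea is just iteration over those records. A sketch of scanning listed sessions for entries that mention a term; the entry shape (role plus content) is an assumption based on the roles named above, and `findMentions` is a hypothetical helper:

```typescript
// Treat transcripts as a shared memory layer: scan every session's
// transcript[] for entries containing a search term.
interface TranscriptEntry {
  role: "user" | "assistant" | "system" | "tool";
  content: string;
}
interface SessionRecord {
  session_id: string;
  transcript: TranscriptEntry[];
}

function findMentions(
  sessions: SessionRecord[],
  term: string
): { session_id: string; content: string }[] {
  const hits: { session_id: string; content: string }[] = [];
  for (const s of sessions) {
    for (const e of s.transcript) {
      if (e.content.includes(term)) hits.push({ session_id: s.session_id, content: e.content });
    }
  }
  return hits;
}
```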

Read from an MCP-aware agent

Drop the API key into your agent's MCP config once. The cerver-mcp package surfaces three tools the agent can call directly: cerver_session_list, cerver_session_peek, and cerver_session_export.

{
  "mcpServers": {
    "cerver": {
      "command": "uvx",
      "args": ["cerver-mcp"],
      "env": { "CERVER_API_TOKEN": "ck_..." }
    }
  }
}

Same data the dashboard at cerver.ai/dashboard#sessions shows you — humans and agents see identical content, scoped to whichever account owns the API token.

Run code, stream output, then read visibility back.

Session execution responses stay provider-aware internally, but Cerver adds its own session-level metadata around them. Streaming responses include extra Cerver headers so your app can observe the gateway path directly.

  • X-Cerver-Session-Id identifies the logical session.
  • X-Cerver-Provider tells you which backend actually executed the run.
  • X-Cerver-Stream-Latency-Ms measures how long it took to open the stream.
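Reading those headers off a streaming response is a one-liner per field. A sketch that works against anything with a Headers-style get(); the header names come from the list above, and `readCerverMeta` is an illustrative helper:

```typescript
// Pull Cerver's session-level metadata off a streaming response.
interface HeadersLike {
  get(name: string): string | null;
}

function readCerverMeta(headers: HeadersLike) {
  return {
    sessionId: headers.get("X-Cerver-Session-Id"),
    provider: headers.get("X-Cerver-Provider"),
    streamLatencyMs: Number(headers.get("X-Cerver-Stream-Latency-Ms") ?? NaN),
  };
}
```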
provision_time_ms: How long the initial compute provisioning took.
time_to_first_exec_ms: How long until the first execution happened after session creation.
last_exec_latency_ms: Latency of the latest non-stream execution.
average_stream_open_latency_ms: Average latency to begin stream delivery.
cost_estimate_usd: Cerver’s estimated session spend so far.
engagement_label: One of idle, warming, engaged, or deep.

Ask Cerver for a comparison before a real run.

Stress tests are the comparison layer. They let Cerver score compute providers for a representative workload and return a structured report your app or agent can use before placing real traffic.

curl -X POST https://your-cerver.example.com/gateway/stress-tests \
  -H "Content-Type: application/json" \
  -d '{
    "task": "Compare preview launch backends",
    "kind": "preview_launch",
    "workload": "preview",
    "requirements": {
      "runtime": "node",
      "public_preview": true,
      "package_install": true,
      "timeout_minutes": 20
    },
    "providers": ["vercel", "e2b"],
    "sample_size": 5
  }'

Today these reports are still simulated from provider profiles and routing logic. The next step is live canary execution.
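Consuming such a report typically means ranking providers by score. A sketch under a loud assumption: the doc does not pin down the report shape, so the `ProviderResult` type and its `score` field are invented for illustration:

```typescript
// Hypothetical per-provider result within a stress-test report.
interface ProviderResult {
  provider: string;
  score: number; // assumed: higher is better
}

// Pick the highest-scoring provider, or null for an empty report.
function bestProvider(results: ProviderResult[]): string | null {
  let best: ProviderResult | null = null;
  for (const r of results) {
    if (best === null || r.score > best.score) best = r;
  }
  return best ? best.provider : null;
}
```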

The lower-level compute API still exists.

If you want to work directly with raw compute instead of a logical session, the lower-level API remains available. The URLs still say /sandbox for compatibility, but this is the compute layer.

POST /sandbox
  Create raw compute directly on a provider.
GET /sandbox/:id
  Read compute metadata.
POST /sandbox/:id/run
  Run code on raw compute.
POST /sandbox/:id/run/stream
  Stream raw compute execution.
POST /sandbox/:id/install
  Install a package on raw compute.
POST /sandbox/:id/files
  Write a file.
GET /sandbox/:id/files
  Read or list files.
GET /sandbox/:id/state
  Read provider-specific state.
POST /sandbox/:id/state
  Write provider-specific state.
DELETE /sandbox/:id
  Terminate the raw compute resource.
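A typical raw-compute sequence chains these endpoints: write a file, then run it. A sketch with an injected POST function so it stands alone; the file-body fields (`path`, `content`) are assumed, since this doc only lists the endpoints:

```typescript
// Illustrative raw-compute sequence against the /sandbox endpoints above.
type PostFn = (path: string, body: unknown) => Promise<any>;

async function writeThenRun(post: PostFn, sandboxId: string): Promise<any> {
  // Assumed request shape for the files endpoint.
  await post(`/sandbox/${sandboxId}/files`, {
    path: "hello.js",
    content: "console.log('hi')",
  });
  return post(`/sandbox/${sandboxId}/run`, { code: "node hello.js" });
}
```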

Current provider picture inside Cerver.

This is the honest state of the current codebase. Cerver can advise on more providers than it can execute today.

p69 Local Computer: working

Local compute adapter wired into Cerver. Once P69_BASE_URL points at a running local server, Cerver can treat your machine like another execution backend.

Vercel Sandbox: working

Live adapter verified through create, run, and stop. Best current execution path.

Cloudflare: partial

Execution path exists, but the current implementation is still container-backed and not the cleanest local path.

E2B: working

Live compute adapter verified through create, run, and stop. Uses bring-your-own E2B credentials.

Daytona: planned

Modeled in the advisor and catalog. Not yet wired as a live execution adapter.

If a service wants to appear in Cerver, it implements the compute interface.

A provider becomes runnable by implementing the shared compute contract. It becomes selectable by being registered in the provider registry and catalog. Apps do not use this directly; the session layer does.

export interface CerverInterface {
  readonly providerName: "cloudflare" | "vercel" | "e2b" | "p69";

  createSandbox(request, env): Promise<SandboxRecord>;
  runSandbox(record, request, env): Promise<Response>;
  runSandboxStream(record, request, env): Promise<Response>;
  installPackage(record, request, env): Promise<Response>;
  writeFile(record, request, env): Promise<Response>;
  readFile(record, path, encoding, env): Promise<Response>;
  getState(record, env): Promise<Response>;
  setState(record, state, env): Promise<Response>;
  deleteSandbox(record, env): Promise<Response>;
}
  • Implement the compute contract to make the provider runnable.
  • Register it in the provider registry to make it live inside Cerver.
  • Add a provider profile to the gateway catalog so the router can score it.
  • Then the session layer can recommend, pin, or fall back to it without being rewritten.
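To show the shape of an implementation, here is a minimal stub against a trimmed version of the contract above: three methods, simplified parameter and return types, and a hypothetical provider name. Real adapters take request and env objects and return HTTP Response objects; this sketch echoes instead of executing:

```typescript
// Simplified record and contract, trimmed from the interface above.
interface SandboxRecord {
  id: string;
  provider: string;
}

interface ComputeAdapter {
  readonly providerName: string;
  createSandbox(request: { task?: string }): Promise<SandboxRecord>;
  runSandbox(record: SandboxRecord, request: { code: string }): Promise<{ stdout: string }>;
  deleteSandbox(record: SandboxRecord): Promise<void>;
}

// Demo adapter: "echo-demo" is a hypothetical provider name, not one of
// Cerver's registered backends.
class EchoAdapter implements ComputeAdapter {
  readonly providerName = "echo-demo";

  async createSandbox(): Promise<SandboxRecord> {
    return { id: "sbx_demo_1", provider: this.providerName };
  }
  async runSandbox(_record: SandboxRecord, request: { code: string }): Promise<{ stdout: string }> {
    return { stdout: request.code }; // echoes the code instead of executing it
  }
  async deleteSandbox(): Promise<void> {}
}
```

Registering such an adapter in the provider registry and catalog is what would make it visible to the router; the stub alone only satisfies the contract.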