Different tasks deserve different (harness, compute) pairs. Creative work goes to one. Analytical work to another. Cheap polish to a third. Cerver lets you compose a session chain where each step picks its own configuration — and hands state down to the next step on the same session id.
Three steps. Three reasons for three configurations. Same session id binding them, same transcript propagating context forward.
One session id · three configurations · state propagates · every step queryable forever
Multi-step product flows where each step has a different shape. Research → write → critique. Triage → solve → verify. Read PR → propose → run tests. Anywhere the right tool changes between steps.
One agent with one model on one compute eats your costs and your latency. A chain pays the premium where it matters (the slow careful step) and the discount where it doesn't (polish). The cost story routes per-turn, not per-customer.
The session. One id binds the steps. The transcript carries forward. From your app's perspective there's one user-facing conversation; under the hood, three different configurations did the work.
Each step is a fresh session create with its own harness and compute, linked to the previous one with parent_session_id. The child inherits the parent's transcript — Step 2 reads Step 1's research without you having to thread state by hand.
// Step 1 — analytical. Claude on your own AWS VPC. const research = await cerver.sessions.create({ harness: { type: 'claude-code' }, compute: { provider: 'aws', credentials: env.AWS }, }); await cerver.sessions.input(research.session_id, { role: 'user', content: 'Research the latest on agent infra at scale. Cite sources.', }); const facts = await cerver.sessions.waitForAssistant(research.session_id); // Step 2 — creative. Codex on Vercel. Inherits Step 1's transcript. const draft = await cerver.sessions.create({ harness: { type: 'codex' }, compute: { provider: 'vercel', credentials: env.VERCEL }, parent_session_id: research.session_id, }); await cerver.sessions.input(draft.session_id, { role: 'user', content: 'Draft a 400-word post using the research above.', }); const text = await cerver.sessions.waitForAssistant(draft.session_id); // Step 3 — polish. Haiku on your local machine. Cheap and fast. const polish = await cerver.sessions.create({ harness: { type: 'claude-code', model: 'haiku' }, compute: { compute_id: 'comp_local' }, parent_session_id: draft.session_id, }); await cerver.sessions.input(polish.session_id, { role: 'user', content: 'Tighten this. Remove jargon. Keep the voice.', }); const final = await cerver.sessions.waitForAssistant(polish.session_id);
Each step's session is independently queryable. If you want to know why Step 3 said what it did, the transcript is there. If you want to A/B the chain against a single-model baseline next quarter, both runs are in the same table.
When you can't predict which model wins, run them all and pick after.