When you don't know which model wins, ask all of them. Run the same intent on parallel sessions — different harnesses, different computes, different prompts. Compare. Send the winner. The others stay searchable in case the next problem looks like one of them.
1 intent · 12 runners · 1 winner · every losing session searchable forever
When you can't predict per-input which model wins. Drafting outreach. Generating ad creative. Code refactors where Sonnet sometimes nails it and sometimes overthinks. Anything where running three is cheaper than sending the wrong one.
Picking a model upfront is a guess. Curating after the fact is a choice, and one informed by answers you can actually read. Most teams spend Q1 picking a provider and Q4 regretting it. Curation moves that decision down to the per-call level where it belongs.
N× tokens for the runners, then 1× human or LLM-judge to pick. For drafting use cases, N is usually 2–3 and the picker is your eyes. For high-value decisions, N can be 6+ and the picker is a cheap judge model scoring against a rubric.
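For the judge-model path, the "cheap judge scoring against a rubric" step can be as small as a scoring function. A minimal sketch in TypeScript; the rubric checks, weights, and type names here are illustrative placeholders, not part of any API:

```typescript
// Hypothetical rubric-based judge: score each draft, highest score wins.
// Checks and weights are illustrative, not prescriptive.
type Draft = { sessionId: string; text: string };

const rubric: Array<{ check: (t: string) => boolean; weight: number }> = [
  { check: (t) => t.length > 0 && t.length < 600, weight: 2 },         // concise
  { check: (t) => !t.toLowerCase().includes('no agenda'), weight: 3 }, // banned filler phrase
  { check: (t) => /\?/.test(t), weight: 1 },                           // asks something specific
];

function judgeScore(draft: Draft): number {
  return rubric.reduce((sum, r) => sum + (r.check(draft.text) ? r.weight : 0), 0);
}

function pickWinner(drafts: Draft[]): Draft {
  // Sort a copy best-first; ties go to the earlier runner.
  return [...drafts].sort((a, b) => judgeScore(b) - judgeScore(a))[0];
}
```

In practice the `check` functions would be replaced by calls to a small judge model, but the shape stays the same: score each runner's output, sort, take the head.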
Three sessions, three (harness, compute) combinations, one intent. Promise.all handles the parallelism. Your code (or a judge) picks the winner. The losers stay queryable — when next month's intent looks like one of them, you already have the answer.
// Same intent, three configurations, in parallel.
const intent = "Draft a follow-up DM. Specific. No 'no agenda from me'.";

const configs = [
  { harness: { type: 'claude-code' }, compute: { compute_id: 'comp_local' } },
  { harness: { type: 'codex' }, compute: { provider: 'vercel', credentials: env.VERCEL } },
  { harness: { type: 'claude-code' }, compute: { provider: 'e2b', credentials: env.E2B } },
];

const drafts = await Promise.all(configs.map(async (cfg) => {
  const { session_id } = await cerver.sessions.create(cfg);
  await cerver.sessions.input(session_id, { role: 'user', content: intent });
  return cerver.sessions.waitForAssistant(session_id);
}));

// Pick by hand, by vote, by judge model — your call.
const winner = drafts.sort(byJudgeScore)[0];
The losing sessions aren't deleted. They sit in your session table, queryable by intent, harness, compute. When the next similar intent arrives, the previous fanout becomes prior art.
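What "prior art" retrieval might look like on the client side. This is a sketch assuming you keep a local log of past fanout runs; the record shape and the word-overlap similarity are illustrative, not the session table's actual schema or query language:

```typescript
// Illustrative local log of past fanout runs — not the real session table schema.
type PastRun = { intent: string; harness: string; winner: boolean; output: string };

// Crude similarity: fraction of shared words between two intents.
function similarity(a: string, b: string): number {
  const wa = new Set(a.toLowerCase().split(/\W+/).filter(Boolean));
  const wb = new Set(b.toLowerCase().split(/\W+/).filter(Boolean));
  const shared = [...wa].filter((w) => wb.has(w)).length;
  return shared / Math.max(wa.size, wb.size);
}

// Surface past runs — winners and losers alike — whose intent resembles this one.
function priorArt(log: PastRun[], intent: string, threshold = 0.4): PastRun[] {
  return log
    .filter((r) => similarity(r.intent, intent) >= threshold)
    .sort((a, b) => similarity(b.intent, intent) - similarity(a.intent, intent));
}
```

The point of keeping losers is exactly this query: a draft that lost last month's fanout may be the right starting point for this month's intent.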
We built this for our own LinkedIn outreach. Two parallel cerver sessions, two voice styles — curiosity-led and take-led. ~25 seconds end-to-end. We pick the one that fits the target, send it, and log the choice so the third draft for the third person is better than the first.
Same code shape as above. Different intent — drafting a DM instead of a refactor. Different curation mechanism — human eye instead of judge model. Same pattern.
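The "log the choice" step from the outreach example can be one record per send. A sketch; the `Choice` shape and function names are hypothetical, shown only to make the feedback loop concrete:

```typescript
// Illustrative choice log for the outreach use case: which voice style won, for whom.
type Style = 'curiosity-led' | 'take-led';
type Choice = { target: string; style: Style; sentAt: string };

const choices: Choice[] = [];

function logChoice(target: string, style: Style): void {
  choices.push({ target, style, sentAt: new Date().toISOString() });
}

// Which style has won most often so far? Null if it's still a tie.
// Use the answer as the default lead draft for the next target.
function preferredStyle(): Style | null {
  const counts: Record<Style, number> = { 'curiosity-led': 0, 'take-led': 0 };
  for (const c of choices) counts[c.style]++;
  if (counts['curiosity-led'] === counts['take-led']) return null;
  return counts['curiosity-led'] > counts['take-led'] ? 'curiosity-led' : 'take-led';
}
```

This is why the third draft beats the first: each send feeds the next fanout's defaults.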
Read the session API →
Different tasks deserve different (harness, compute) pairs. Route each turn, not each customer.