been using claude code to build a system that runs one question through 5 different LLMs in fixed roles, then synthesizes the outputs. went in thinking the model calls would be the hard part. they weren’t. the orchestration around them was.

stuff that ate the most time and that claude code was actually great for:

- the round logic is all deterministic. when each model gets called, what context it sees, when the synthesis step fires: none of that is an LLM decision, it’s a state machine. claude code was way better at this than i expected once i stopped asking it to “design the flow” and started giving it the state transitions explicitly.
- keeping each model’s role isolated. the whole thing breaks if the prompts bleed into each other, so each seat needed its own strict context boundary. having claude code write the guardrails (validate inputs, enforce the role, reject off-task output) mattered more than the prompts themselves.
- cost/latency control. 5 model calls per request adds up fast. claude code helped me move everything to budget-tier models for the individual seats and reserve the expensive one only for the final synthesis.

the thing i keep landing on: the model is the smallest part of the stack. the deterministic layer (routing, state, validation, when not to call a model) is where the actual engineering is. felt relevant to the “how model-less can agents get” stuff floating around here lately.

curious how others are handling multi-model orchestration in claude code: are you letting the model drive the control flow, or keeping that deterministic and using the LLM only for generation?
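
for anyone who wants something concrete, here is a rough python sketch of what a deterministic layer like the one described above could look like. it is illustrative only, not the actual project code: the seat names, the `budget`/`premium` tier labels, the `call_model` callable, the retry count, and the `[role]` tag check are all assumptions made up for the example. the point is just that the control flow is plain code and the model is only called for generation.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable


class State(Enum):
    SEATS = auto()      # every role answers the question independently
    SYNTHESIS = auto()  # one call merges the validated seat outputs
    DONE = auto()


# Hypothetical roles and tiers: cheap models for seats, expensive one for synthesis.
SEATS = {"analyst": "budget", "skeptic": "budget", "historian": "budget",
         "strategist": "budget", "economist": "budget"}
SYNTH_TIER = "premium"

ModelFn = Callable[[str, str], str]  # (tier, prompt) -> generated text


@dataclass
class Run:
    question: str
    state: State = State.SEATS
    seat_outputs: dict[str, str] = field(default_factory=dict)
    final: str = ""
    calls: list[tuple[str, str]] = field(default_factory=list)  # (tier, role) log


def seat_prompt(role: str, question: str) -> str:
    # Context boundary: a seat sees only its own role and the question,
    # never another seat's prompt or output.
    return f"You are the {role}. Start your reply with [{role}].\nQuestion: {question}"


def validate_seat_output(role: str, text: str) -> bool:
    # Placeholder guardrail: non-empty, bounded, and tagged with the seat's own role.
    return bool(text.strip()) and len(text) < 4000 and text.startswith(f"[{role}]")


def step(run: Run, call_model: ModelFn, max_retries: int = 1) -> None:
    """Advance the run by exactly one state transition."""
    if run.state is State.SEATS:
        for role, tier in SEATS.items():
            for _ in range(max_retries + 1):
                out = call_model(tier, seat_prompt(role, run.question))
                run.calls.append((tier, role))
                if validate_seat_output(role, out):
                    run.seat_outputs[role] = out
                    break  # off-task output is dropped after the retries run out
        run.state = State.SYNTHESIS
    elif run.state is State.SYNTHESIS:
        if run.seat_outputs:
            joined = "\n".join(run.seat_outputs[r] for r in SEATS if r in run.seat_outputs)
            prompt = f"Synthesize these role outputs into one answer:\n{joined}"
            run.final = call_model(SYNTH_TIER, prompt)
            run.calls.append((SYNTH_TIER, "synthesis"))
        # With no valid seat output, the expensive call is skipped entirely.
        run.state = State.DONE


def run_question(question: str, call_model: ModelFn) -> Run:
    if not question.strip():
        raise ValueError("question must not be empty")  # input validation
    run = Run(question)
    while run.state is not State.DONE:
        step(run, call_model)
    return run
```

`call_model` is injected so the whole flow can be tested with a fake function and no network calls. `run.calls` gives a per-request count of budget vs. premium calls, which is the number to watch for cost.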
Originally posted by u/wartableapp on r/ClaudeCode
