I have been using, and very much enjoying, the agent teams feature for intra-repo development. That said, I am planning to downgrade from the $200/mo. plan, as I’m approaching the end of my project’s initial “bulking” phase and my iterative changes have become more surgical. To make up for the reduced usage limit, I am hoping to offload simpler tasks (Haiku/Sonnet-level tasks) to Qwen3.6 running locally in Ollama.
I have used Claude with Ollama via ollama launch claude, but as far as I can tell that simply overrides the model server environment variable, which leaves Claude unable to switch between Opus and a local model within the same session. I haven’t found conclusive evidence, but it seems that the team orchestrator agent can’t spawn a team member that uses a model unavailable to the orchestrator itself.
Does anyone know of a way to use a mixture of Anthropic and local LLM models, short of creating a skill/CLI tool to act as a message broker?
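For reference, the broker-style fallback I have in mind could be as small as the sketch below: a helper that forwards one prompt to the local Ollama server over its HTTP API and returns the reply. This is only an illustration under assumptions. The model tag `qwen3.6` and the function names `build_request` and `delegate` are hypothetical (use whatever tag `ollama list` reports), and the URL assumes Ollama is listening on its default port.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: hypothetical tag; replace with the tag shown by `ollama list`.
MODEL_TAG = "qwen3.6"


def build_request(prompt, model=MODEL_TAG, url=OLLAMA_URL):
    """Build a non-streaming generate request for the local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def delegate(prompt, model=MODEL_TAG, opener=urllib.request.urlopen):
    """Send one task to the local model and return its text reply.

    `opener` is injectable so the helper can be exercised without a server.
    """
    with opener(build_request(prompt, model), timeout=300) as resp:
        reply = json.loads(resp.read().decode("utf-8"))
    # With "stream": False, Ollama returns one JSON object whose
    # "response" field holds the generated text.
    return reply["response"]
```

Calling `delegate("Write a docstring for this function: ...")` from a small wrapper script would let the orchestrator hand off a simple task through a shell command, but that is exactly the extra plumbing I would like to avoid if there is a built-in way.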
Additional, semi-related question: does anyone else find that the team orchestrator’s context fills up very rapidly? There seems to be a great deal of message passing between the team lead agent and the various team members, and I suspect that most of those messages are not necessary to maintain coherence in a long-running dev session.
Originally posted by u/carbon_fire on r/ClaudeCode
