Original Reddit post

I was using expensive models for everything, so I split my setup into multiple agents (main, admin, creative, etc.), each handling a specific task. Now I only use high-end models for reasoning, and cheaper models (Haiku, GPT-4o-mini, Gemini) for simple or repetitive work. I also moved recurring tasks to n8n instead of agents. This cut my costs by roughly 70–90%. Any tips on reducing token usage, or better ways to handle memory/context in multi-agent setups?
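The routing idea in the post can be sketched as a tiny dispatcher that sends reasoning-heavy tasks to an expensive model and simple/repetitive ones to a cheap model. The model names and the keyword heuristic below are illustrative assumptions, not the poster's actual setup:

```python
# Hypothetical sketch of tiered model routing (names and heuristic are assumptions).
EXPENSIVE_MODEL = "claude-opus"   # reasoning, planning, debugging
CHEAP_MODEL = "claude-haiku"      # summaries, formatting, extraction

# Crude keyword heuristic; a real router might use a classifier or task metadata.
REASONING_KEYWORDS = ("plan", "debug", "architect", "prove", "refactor")

def pick_model(task: str) -> str:
    """Route a task description to a model tier with a simple keyword check."""
    lowered = task.lower()
    if any(kw in lowered for kw in REASONING_KEYWORDS):
        return EXPENSIVE_MODEL
    return CHEAP_MODEL

print(pick_model("debug the failing auth flow"))    # routes to expensive tier
print(pick_model("summarize yesterday's standup"))  # routes to cheap tier
```

In practice the dispatcher would sit in front of the API call, so each agent only ever pays for the tier its task actually needs.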

Originally posted by u/PankajKumarTechie on r/ClaudeCode