Original Reddit post

Freelance dev here, work mostly on backend stuff, react frontends occasionally. I run Claude Code on API billing rather than the subscription plan, so everything is pay-per-token. It has become the center of my coding workflow over the past few months, but i stopped using one model for every single task. The setup is a little messier than the clean screenshots people post, but it has worked well enough that i thought i’d share the actual version. Claude Sonnet in Claude Code for anything that needs real repo context. Architecture stuff, untangling a gnarly bug across multiple files, deciding how to structure a new module. The long context helps a lot when i can point it at 4 or 5 related files and ask “what’s the cleanest way to add X without breaking Y”. The responses are slower and more expensive per token but the quality on this kind of work is noticeably better than what i get from cheaper models. A cheaper GPT model for fast iteration outside the main repo loop. Quick syntax questions, “rewrite this loop as a list comprehension”, “what’s the regex for X”, “fix this typescript error”. It’s fast, it’s cheap enough that i don’t think about it, and for these small focused asks the quality difference vs claude isn’t worth the latency. DeepSeek for code review and refactor suggestions. Honestly didn’t expect this one to work. But for “look at this PR diff and tell me what i missed” type tasks, the reasoning model has been surprisingly thoughtful and cheap enough that i can run it on every commit if i want without thinking about cost. Occasionally hallucinates an api that doesn’t exist, but it’s caught real bugs for me too. Logistics-wise this used to be a pain. Claude Code stayed on the Claude side, but my helper scripts had three providers, three keys, three sdks, and a zsh history full of curl commands going to different endpoints. Recently i pointed the scripts through tokenrouter so i only deal with one endpoint and one dashboard locally instead of three separate api configs. Mostly i just wanted to stop juggling dashboards. There are other gateways that do similar things, like litellm if you want to self host or openrouter for a hosted option. Pick whichever annoys you least. The tooling integration matters more than i used to think. Claude Code gets the real project work. The cli scripts are for narrow jobs where i know exactly what model i want. One annoying gotcha: when you’re routing through a gateway, the streaming event shapes can be slightly different from hitting the provider direct. Specifically if you’re parsing tool_use blocks from claude streams in your own scripts, the boundaries can come in as a slightly different sse event type via the gateway. Burned an afternoon on it before i normalized the parser. I’ve been doing it long enough now that i can usually predict which model will do better at a given task before i ask it, and i think that’s actually the most useful skill, not the tooling. Claude Code is still where the serious work happens for me. The cheaper side paths just keep me from burning expensive tokens on tiny questions. submitted by /u/RevealNoo

Originally posted by u/RevealNoo on r/ClaudeCode