Original Reddit post

ok so I need to vent a bit and also genuinely ask if anyone else is dealing with this we run a startup with an app that has thousands of users. small team, me (co-founder), my co-founder and two other devs. we’ve been using claude code heavily for the past few months and honestly it changed how we work… but the bill is getting real. we’re also using claude code for completely different stuff. like design in blender with mcp, and running marketing agents. last month we crossed $3k total. and like… the results are there. we ship faster. but $3k/month for a team of 4 at a startup is not nothing. the worst part is I can feel it affecting how I work. like I’ll hesitate before asking claude to help with something because I’m thinking “is this worth $0.50?”. or I’ll try to batch things together in one prompt to save tokens and then the output is worse because the context is too packed. has anyone found a good way to manage this? I’ve thought about: setting up some kind of local model for the simple stuff (renames, formatting, small fixes) and only sending the complex architecture/debugging tasks to claude having dedicated local machines running qwen A10B (that looks very promising) or qwen A3B coder and having a router mix the models to save tokens for all the team putting spending caps per person per week just accepting it as the cost of being competitive in 2026 the local model idea is interesting to me but I have no idea if it actually works in practice. like can a 7B model running on decent hardware actually handle the boring coding tasks well enough that you’d trust it? curious what other teams are doing. especially if you’re in a similar situation where it’s not just devs using it but the whole team across different workflows. submitted by /u/sandropuppo

Originally posted by u/sandropuppo on r/ClaudeCode