Do you use local models for subagents that handle small tasks? If you have the hardware, it seems like a very good way to stretch your usage limits a little further. I just started running google/gemma-4-e4b with Q4 quantization on my RTX 3060 12GB with a 131k context window. I'll see whether it's useful or not.
Originally posted by u/imLostify7 on r/ClaudeCode
