Open source tool: https://github.com/kunal12203/Codex-CLI-Compact

Claude Code is insanely powerful, but token usage gets out of control once you're working on anything beyond a toy repo. I kept noticing the same pattern: my prompt is small, but the agent expands context massively, and suddenly each run is burning 80k–100k+ tokens.

So I built a small system (GrapeRoot) to fix this. Instead of sending full repo context, it:

- tracks file-level changes
- builds a dependency graph
- selects only the minimum relevant context
- avoids re-sending unchanged chunks

**Real runs (side-by-side)**

Same prompts. Same repo. No tricks.

P1: PagerDuty flow

- Normal: 95.3k tokens
- Optimized: 31.6k tokens
- Reduction: 67%

P2: passes() logic debugging

- Normal: 80.5k tokens
- Optimized: 34.4k tokens
- Reduction: 57%

P3: Slack 429 issue

- Normal: 104.2k tokens
- Optimized: 22.7k tokens
- Reduction: 78%

Aggregate

- Normal total: 280k tokens
- Optimized total: 88.7k tokens
- Net reduction: ~68%

**What actually surprised me**

Most of the waste isn't in your prompt. It comes from:

- the agent reloading large parts of the repo
- repeated context across steps
- irrelevant files getting pulled in

Basically, you're paying for context you didn't ask for.

**Where this breaks (important)**

Not perfect:

- misses context if the dependency graph is incomplete
- struggles with dynamic/runtime dependencies
- less effective on messy or highly coupled codebases

**Why this matters**

If you're doing multi-step workflows, this compounds fast. A single task means:

- 5–10 agent calls
- each wasting ~50k tokens

You're easily burning 300k–800k tokens per task without realizing it.
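To make the selection idea concrete, here's a minimal Python sketch of dependency-aware context selection. This is *not* GrapeRoot's actual implementation; the function names (`file_hash`, `local_imports`, `select_context`) and the flat-repo, one-hop assumptions are mine, purely for illustration. It hashes files to detect changes, extracts intra-repo imports, and returns only changed files plus their direct dependencies:

```python
# Hypothetical sketch of dependency-aware context selection:
# hash files to detect changes, build an import graph via ast,
# then walk outward from changed files to pick minimal context.
import ast
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    """Content hash used to skip unchanged files."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def local_imports(path: Path, module_names: set[str]) -> set[str]:
    """Modules in this repo that the given file imports."""
    tree = ast.parse(path.read_text())
    found: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            found |= {a.name.split(".")[0] for a in node.names}
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found & module_names

def select_context(repo: Path, old_hashes: dict[str, str]) -> list[str]:
    """Return only files that changed, plus their direct dependencies."""
    files = {p.stem: p for p in repo.glob("*.py")}
    changed = {name for name, p in files.items()
               if old_hashes.get(name) != file_hash(p)}
    # One hop only; a real system would walk the full graph
    # and cap the result against a token budget.
    needed = set(changed)
    for name in changed:
        needed |= local_imports(files[name], set(files))
    return sorted(needed)
```

With a repo where only `a.py` (which imports `b`) changed, this selects `a` and `b` and leaves every other file out of the prompt, which is where the bulk of the token savings above would come from.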
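The "avoids re-sending unchanged chunks" part can be sketched the same way. Again, this is a hypothetical illustration, not the tool's real code: a session-level cache of chunk hashes, where each step only ships chunks it hasn't sent before:

```python
# Hypothetical sketch of cross-step chunk deduplication: remember
# the hash of every chunk already sent this session, and on each
# agent step send only chunks that are new or modified.
import hashlib

class ChunkCache:
    def __init__(self, chunk_lines: int = 40):
        self.chunk_lines = chunk_lines
        self.sent: set[str] = set()  # hashes of chunks already sent

    def _chunks(self, text: str):
        """Split text into fixed-size line chunks."""
        lines = text.splitlines(keepends=True)
        for i in range(0, len(lines), self.chunk_lines):
            yield "".join(lines[i:i + self.chunk_lines])

    def delta(self, text: str) -> list[str]:
        """Return only unseen chunks and mark them as sent."""
        out = []
        for chunk in self._chunks(text):
            h = hashlib.sha256(chunk.encode()).hexdigest()
            if h not in self.sent:
                self.sent.add(h)
                out.append(chunk)
        return out
```

On a multi-step task, the second and later calls for an unchanged file return an empty delta, which is exactly the "repeated context across steps" waste described above.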
Originally posted by u/intellinker on r/ArtificialInteligence
