eifachposte


GitHub: https://github.com/kunal12203/Codex-CLI-Compact
Must explore: https://graperoot.dev/

Everyone’s feed is full of it. “We processed 500M tokens this sprint.” “Our agent burned 1B last month.” Cool flex. I went the other direction: I spent the last few months obsessively figuring out how to make the same agent do the same job with as few tokens as possible.

The problem I kept hitting while building production AI systems: the agent would grep the whole codebase, read half of it, then cite the wrong file anyway. On a real Go codebase (Gitea, ~1M LoC), a vanilla agent was burning 13–15 tool calls just orienting itself before writing a single line. Same pattern on TypeScript. Same on C++. The agent wasn’t bad, it just had no idea what was relevant, so it read everything and hoped. Like hiring a senior engineer who has to open every drawer in the building before answering any question.

So I built GrapeRoot, a local graph indexer, using Claude Code. It runs once on your repo, maps symbols, builds dependency + file relationship graphs, then gives the agent a surgical ~4K-token slice containing only what’s actually relevant. No per-query retrieval cost. Just graph traversal.

The receipts (10 audit prompts, sonic-net/sonic-swss, 276K LoC C++, same agent both sides):

Some tasks saved up to 85% of the cost (including refactoring, debugging, etc.).

Higher quality. Half the tokens. Faster.

The interesting part: the agent didn’t get worse with less context, it got smarter. Because 35 of those 40 files were noise. Wrong context is worse than less context.

Add it up across months of dogfooding, benchmarks, and team pilots: over a billion tokens not processed. Not “tokens burned.” Tokens that never had to be paid for, attended to, or hallucinated over.

Feels like the real flex isn’t how much you burned, it’s how much you didn’t have to. I open-sourced the launcher.
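For anyone wondering what "index once, then just traverse the graph" can look like, here is a minimal sketch. It is not GrapeRoot's actual implementation: the toy `INDEX`, the file names, the token counts, and the `select_slice` helper are all hypothetical, and a plain breadth-first walk stands in for whatever ranking the real engine uses. The only idea it illustrates is picking files reachable from the query's seed files until a ~4K-token budget is used up.

```python
from collections import deque

# Hypothetical toy index: file -> (approx. token count, files it depends on).
# A real indexer would derive this from parsed symbols and imports.
INDEX = {
    "router.go": (1500, ["auth.go", "models.go"]),
    "auth.go": (1200, ["models.go", "crypto.go"]),
    "models.go": (1000, []),
    "crypto.go": (900, []),
    "docs_gen.go": (3000, ["models.go"]),
}


def select_slice(index, seeds, budget=4000):
    """Breadth-first walk from the seed files, keeping each file that still fits the budget."""
    chosen, used = [], 0
    seen = set(seeds)
    queue = deque(seeds)
    while queue:
        name = queue.popleft()
        tokens, deps = index[name]
        if used + tokens > budget:
            continue  # file would overflow the budget: skip it and do not expand its deps
        chosen.append(name)
        used += tokens
        for dep in deps:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return chosen, used


if __name__ == "__main__":
    files, tokens = select_slice(INDEX, ["router.go"])
    print(files, tokens)
```

In this toy run, `docs_gen.go` is never read because nothing reachable from the seed depends on it, and `crypto.go` is dropped because it would push the slice past the budget. That is the "noise never enters the context" effect, in miniature.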
GrapeRoot, the full graph engine + MCP server (plus Pro and enterprise features), is in early access now. Happy to share benchmark harnesses, raw transcripts, or debate AST graphs vs RAG for cross-file inference.

Submitted by /u/intellinker

Originally posted by u/intellinker on r/ClaudeCode

In the era of 1B-token flexing, I saved 1B tokens in Claude Code!
