Free Tool: https://grape-root.vercel.app/
Github Repo: https://github.com/kunal12203/Codex-CLI-Compact
Join Discord for debugging/feedback.

I've been deep into Claude Code usage recently (burned ~$200 on it), and I kept seeing people claim: "90% cost reduction."

Honestly, that sounded like BS. So I tested it myself.

**What I found (real numbers)**

I ran 20 prompts across different difficulty levels (easy → adversarial), comparing:

- Normal Claude
- CGC (graph via MCP tools)
- My setup (pre-injected context)

Results summary:

- ~45% average cost reduction (the realistic number)
- up to ~80–85% token reduction on complex prompts
- fewer turns (≈70% fewer in some cases)
- better or equal quality overall

So yes, you can reduce tokens heavily. But you don't get a flat 90% cost cut across everything.

**The important nuance (most people miss this)**

Cutting tokens ≠ cutting quality, if done right.

The goal is not:
- starve the model of context
- compress everything aggressively

The goal is:
- give the right context upfront
- avoid re-reading the same files
- reduce exploration, not understanding

**Where the savings actually come from**

Claude is expensive mainly because it:

- re-scans the repo every turn
- re-reads the same files
- re-builds context again and again

That's where the token burn is.

**What worked for me**

Instead of letting Claude "search" every time:

- pre-select relevant files
- inject them into the prompt
- track what's already been read
- avoid redundant reads

So Claude spends tokens on reasoning, not discovery.

**Interesting observation**

On harder tasks (like debugging, migrations, cross-file reasoning):

- tokens dropped a lot
- answers actually got better

Because the model started with the right context instead of guessing.

**Where "90% cheaper" breaks down**

You can hit ~80–85% token savings on some prompts. But overall:

- simple tasks → small savings
- complex tasks → big savings

So the average settles around ~40–50% if you're honest.

**Benchmark snapshot**

(Attaching charts: cost per prompt plus a summary table.) You can see:

- GrapeRoot consistently lower cost
- fewer turns
- comparable or better quality

**My takeaway**

Don't try to "limit" Claude. Guide it better. The real win isn't reducing tokens. It's removing unnecessary work from the model.

**If you're exploring this space**

Curious what others are seeing:

- Are your costs coming from reasoning or exploration?
- Anyone else digging into token breakdowns?
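The workflow above (pre-select files, inject them upfront, track what's already been sent) can be sketched in a few lines. This is a minimal illustration, not GrapeRoot's actual implementation: the `ContextInjector` class and its keyword-overlap ranking are hypothetical stand-ins; a real tool would rank files with a dependency graph or embeddings rather than word counts.

```python
# Minimal sketch: pre-select relevant files, inject them into the
# prompt once, and track reads so later turns skip redundant context.
# ContextInjector and its naive keyword scoring are illustrative only.
from pathlib import Path


class ContextInjector:
    def __init__(self, repo_root):
        self.repo_root = Path(repo_root)
        self.already_read = set()  # files injected in earlier turns

    def select_files(self, task, limit=5):
        """Rank repo files by crude keyword overlap with the task."""
        keywords = set(task.lower().split())
        scored = []
        for path in self.repo_root.rglob("*.py"):
            text = path.read_text(errors="ignore").lower()
            score = sum(text.count(k) for k in keywords)
            if score:
                scored.append((score, path))
        scored.sort(key=lambda item: item[0], reverse=True)
        return [path for _, path in scored[:limit]]

    def build_prompt(self, task):
        """Inject only files not already sent, so tokens buy reasoning."""
        parts = [f"Task: {task}\n"]
        for path in self.select_files(task):
            if path in self.already_read:
                continue  # skip redundant re-reads
            self.already_read.add(path)
            parts.append(f"--- {path} ---\n{path.read_text(errors='ignore')}")
        return "\n".join(parts)
```

The `already_read` set is what kills the biggest cost driver named above: the same file never enters the prompt twice across turns.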
Originally posted by u/intellinker on r/ArtificialInteligence
