Original Reddit post

You know that specific frustration where you ask Claude something about your project and it gives you a textbook-correct answer that has absolutely nothing to do with how your codebase is actually structured? That was my life for months. I'd paste files. Claude would answer confidently. Half the time it was describing patterns my project doesn't even use. I'd paste more files. It'd get confused by earlier context. I'd start a fresh session. We'd be back to square one.

The problem isn't Claude. Claude is genuinely good. The problem is that every session, Claude walks into your project completely blind, and you're the one responsible for catching it up.

I built Token Reducer because I was sick of being my own context manager.

github.com/Madhan230205/token-reducer

**Here's what actually happens under the hood**

It indexes your codebase once — AST parsing, not just ctrl+F style text matching. Tree-sitter reads your code the way a compiler would: it knows what's a function, what's a class, what imports what. Then when you ask Claude something, before your message even reaches Claude, it:

- Finds the semantically relevant chunks using combined keyword + vector search
- Traces the import graph to pull in file dependencies automatically
- Does 2-hop symbol expansion — if your question touches `process_payment()`, it follows what that function calls too, without you asking
- Runs TextRank scoring (the same graph math as early PageRank, applied to code chunks) to figure out what to keep and what to drop

By the time Claude sees your question, it's already holding the 3–5 files that actually matter. Not your whole repo. Not a random slice. The right slice.

On my 40k-line Python monorepo, this went from Claude giving me generic "here's how you might structure auth" answers to Claude saying "in your `UserSession.validate()` method, line 47, you're not invalidating the refresh token on logout — here's the fix." That shift in specificity is what I was chasing.
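To make the 2-hop symbol expansion step concrete, here's a minimal sketch of the idea in plain Python. The function name `expand_symbols` and the dict-based call graph are illustrative assumptions, not the project's actual API — the real tool builds its graph from Tree-sitter output:

```python
from collections import deque

def expand_symbols(call_graph, seeds, hops=2):
    """Breadth-first expansion over a call graph: starting from the
    symbols a query touches, pull in everything reachable within
    `hops` calls. Hypothetical helper mirroring the step described above."""
    seen = set(seeds)
    frontier = deque((sym, 0) for sym in seeds)
    while frontier:
        sym, depth = frontier.popleft()
        if depth == hops:
            continue  # don't expand past the hop budget
        for callee in call_graph.get(sym, ()):
            if callee not in seen:
                seen.add(callee)
                frontier.append((callee, depth + 1))
    return seen

# Toy call graph: a query touching process_payment() also pulls in
# its direct callees and their callees, but stops there.
graph = {
    "process_payment": ["charge_card", "log_txn"],
    "charge_card": ["gateway_request"],
    "gateway_request": ["retry"],
}
print(sorted(expand_symbols(graph, {"process_payment"})))
# → ['charge_card', 'gateway_request', 'log_txn', 'process_payment']
```

Note that `retry` is excluded: it sits three hops out, which is exactly the kind of cutoff that keeps the context slice small.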
**How to install (Claude Code, 2 commands):**

```
/plugin marketplace add Madhan230205/token-reducer
/plugin install token-reducer@Madhan230205-token-reducer
```

Then just ask it things:

- `/token-reducer "walk me through the payment flow"`
- `/token-reducer "where does user input get sanitized?"`
- `/token-reducer "why is the background job failing?"`

Everything runs on your machine. No API keys. No data going anywhere. Indexing takes ~350ms; queries answer back in under 30ms.

If you hate installing ML dependencies (tree-sitter, sentence-transformers, the whole zoo), there's a zero-dependency mode that falls back to hash embeddings and regex chunking. It's less precise, but it literally just works out of the box:

```bash
git clone https://github.com/Madhan230205/token-reducer.git
python scripts/context_pipeline.py run --inputs ./src --query "find auth logic" --embedding-backend hash --db .cache/index.db
```

**What I benchmarked** (ran it against the repo itself, 51 files, ~35k tokens):

- Average query time: 27ms
- Compression ratio: 5.83x
- Savings per 100 queries at Sonnet pricing: ~$34

I'll be honest — the cost thing is real, but it's not why I built this. I built this because I wanted to stop being the middleman between Claude and my own code.

**What it won't fix:** If your code is a mess, Claude's answers will reflect that. This isn't magic — it's retrieval. Garbage in, slightly better organized garbage out. It also won't fix Claude confidently saying wrong things. But it does dramatically reduce the cases where Claude is wrong because it didn't have the right context, which in my experience is most of them.

**Tech stack if you're curious:** SQLite FTS5, HNSW via hnswlib, Jina Code v2 embeddings, Tree-sitter for parsing. MIT licensed, no telemetry, open to contributions.

Drop a comment if you've hit this problem — curious whether others have found workarounds I haven't tried.
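If you're wondering how a "hash embedding" fallback can work with zero ML dependencies, here's a rough sketch of the general technique (feature hashing). The function names and the dimension are my own illustration, not the project's actual implementation, which may differ:

```python
import hashlib
import math
import re

def hash_embed(text, dim=64):
    """Zero-dependency embedding: hash each token into one of `dim`
    buckets, count hits, L2-normalize. No learned semantics — two
    texts only look similar if they share literal tokens — but it
    needs nothing beyond the standard library."""
    vec = [0.0] * dim
    for tok in re.findall(r"\w+", text.lower()):
        bucket = int(hashlib.md5(tok.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already unit-length, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

query = hash_embed("find auth logic")
doc = hash_embed("def check_auth(user): auth logic for login")
print(cosine(query, doc) > 0)  # shared tokens "auth"/"logic" overlap
```

The trade-off is exactly what the post says: you lose the semantic matching a real embedding model gives you (a query for "login" won't match "authentication" unless the words co-occur), which is why this mode is less precise.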

Originally posted by u/Low_Stomach3065 on r/ClaudeCode