Original Reddit post

I kept running into the same issue on larger projects: Claude would burn through tokens hunting for the right files, reading stuff it didn’t need and missing stuff it did. On a 500-file Go backend it was using 50-200K tokens per query just on file hunting. I know other tools are out there, but I wanted something a bit lighter and honestly wanted to just make something myself. So I built attnroute. It’s a set of Claude Code hooks that maintain a “working memory” of your codebase and inject only what’s relevant:

  • HOT files (actively working on) -> full content
  • WARM files (related context) -> symbols/signatures only
  • COLD files (not relevant right now) -> evicted entirely

Before: 50-200K tokens per query
After: 2,027 tokens per query
90% reduction in 309ms

How it works:

  • Tracks which files you access and lets them decay over time (recent = hot, old = cold)
  • Learns co-activation patterns (files used together get linked)
  • Uses tree-sitter + PageRank on your dependency graph to rank file importance
  • Fits everything within a token budget

The repo mapping approach is heavily inspired by Aider's tree-sitter + PageRank work; credit to Paul Gauthier for pioneering that.

It also ships with three plugins that address real Claude Code pain points:

  • LoopBreaker: detects when Claude is stuck repeating the same failing approach (#21431)
  • VerifyFirst: enforces read-before-write so Claude doesn’t speculatively edit files it hasn’t read (#23833)
  • BurnRate: tracks your token consumption rate and warns before you hit limits (#22435)

Install:

    pip install attnroute[all]
    attnroute init

Then type /hooks and approve. That’s it, works immediately, no restart.

Zero required dependencies. Optional extras for tree-sitter graph analysis, BM25 search, and semantic search via ChromaDB. MIT licensed.

Being transparent about limitations: the file predictor hits about 0.35-0.42 F1 (Precision ~45%, Recall ~60%). That’s fine because token reduction is the goal, not perfect prediction – even imperfect prediction massively beats injecting your whole codebase, and Claude can still Read files on demand.

GitHub: https://github.com/jeranaias/attnroute
PyPI: https://pypi.org/project/attnroute/0.5.3/

Would love feedback from anyone who tries it on a larger codebase. Solo dev project so bug reports are genuinely appreciated.
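
For anyone who wants a feel for the HOT/WARM/COLD idea, here is a rough sketch of decay-based tiering under a token budget. It is not attnroute's actual code: the names (`FileState`, `score`, `plan_context`), the half-life, and both thresholds are made up for illustration, and the real tool may score files quite differently.

```python
from dataclasses import dataclass

# All names and constants here are hypothetical, chosen for illustration;
# they are not attnroute's real API or thresholds.
HALF_LIFE_TURNS = 5    # assumed: a score halves every 5 turns without access
HOT_THRESHOLD = 0.6    # assumed cutoff for injecting full content
WARM_THRESHOLD = 0.2   # assumed cutoff for injecting signatures only


@dataclass
class FileState:
    path: str
    last_access_turn: int
    full_tokens: int      # cost of injecting the whole file
    outline_tokens: int   # cost of injecting symbols/signatures only


def score(state: FileState, current_turn: int) -> float:
    """Exponential decay: 1.0 when just accessed, 0.5 after one half-life."""
    age = current_turn - state.last_access_turn
    return 0.5 ** (age / HALF_LIFE_TURNS)


def plan_context(files, current_turn, budget):
    """Assign a tier to each file, highest score first, within the budget."""
    plan = {}
    remaining = budget
    ranked = sorted(files, key=lambda f: score(f, current_turn), reverse=True)
    for f in ranked:
        s = score(f, current_turn)
        if s >= HOT_THRESHOLD and f.full_tokens <= remaining:
            plan[f.path] = "HOT"
            remaining -= f.full_tokens
        elif s >= WARM_THRESHOLD and f.outline_tokens <= remaining:
            # Also reached by a high-scoring file whose full content
            # no longer fits: it is downgraded to its outline.
            plan[f.path] = "WARM"
            remaining -= f.outline_tokens
        else:
            plan[f.path] = "COLD"
    return plan
```

With a budget of 1,000 tokens, a file touched this turn gets its full content, a file touched five turns ago drops to signatures only, and anything that no longer fits is evicted.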

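The PageRank step mentioned above can be illustrated with a plain power-iteration sketch over a hand-written dependency graph. This is only a toy: the file names are invented, the graph is typed in by hand rather than extracted with tree-sitter, and the `pagerank` function is a generic textbook version, not what attnroute or Aider actually runs.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Rank files by importance.

    graph maps each file to the list of files it depends on; rank flows
    from a file to its dependencies, so widely used files score highest.
    """
    nodes = set(graph) | {dep for deps in graph.values() for dep in deps}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            deps = graph.get(node, [])
            if deps:
                share = damping * rank[node] / len(deps)
                for dep in deps:
                    new_rank[dep] += share
            else:
                # A file with no dependencies spreads its rank evenly,
                # so the total rank stays at 1.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical dependency graph for a small Go service.
deps = {
    "main.go": ["db.go", "auth.go"],
    "auth.go": ["db.go"],
    "handlers.go": ["auth.go", "db.go"],
}
ranks = pagerank(deps)
most_important = max(ranks, key=ranks.get)
```

Here `db.go` comes out on top because every other file depends on it, directly or through `auth.go`. A ranking like this could be used to decide which files are worth keeping in the WARM tier when the budget is tight.
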
Originally posted by u/jcmguy96 on r/ClaudeCode