Original Reddit post

I built an L1/L2 cache hierarchy for Claude Code’s memory: here are the real numbers

After 100+ Claude Code sessions on the same project, I hit the knowledge management wall. My project context (deployment gotchas, architectural decisions, client info, content workflows) grew to 53 structured pages. That is way too much for CLAUDE.md, but I still need it accessible.

I borrowed from CPU cache design:

L1 (always loaded): MEMORY.md + CLAUDE.md = ~2,000 tokens. Critical gotchas, port numbers, naming conventions. Fixed overhead every session.

L2 (on-demand): A Logseq wiki with 53 pages across 8 namespaces (Projects, Tech, Business, Content, People, Reference, Careers, Learning). Accessed via a /wiki skill that greps for relevant pages, reads the top 3-5 matches, and synthesizes an answer.

Honest assessment: Token savings are real but modest. If I’d put everything in CLAUDE.md (always loaded), that’s ~25K tokens/session. With the L1/L2 split, the average is ~5K. That saves ~$15/month at Opus pricing. Not life-changing.

The actual value:

1. Knowledge persists across 100+ sessions without re-typing
2. Only relevant pages load (JIT retrieval, not bulk loading)
3. Cross-references between pages let Claude follow context chains (project → partner → pricing)
4. One update propagates to all future sessions

The trade-offs are real too: added complexity, Logseq as a dependency, skill instructions that cost ~3K tokens per invocation, and you have to actually remember to /wiki ingest new knowledge.

Anyone else building structured memory systems for Claude Code? Curious what approaches others are taking: MCP servers, custom tools, external RAG, or just very organized CLAUDE.md files?
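For anyone who wants to try the L2 lookup without Logseq, the "grep for relevant pages, read the top 3-5 matches" step can be sketched roughly like this. It is not the actual /wiki skill from the post, just a minimal illustration that assumes the wiki pages are plain Markdown files in one directory tree. The function name `rank_pages` and the scoring by raw keyword count are made up for the example.

```python
from pathlib import Path
import re


def rank_pages(wiki_dir: str, query: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Return up to top_n (file name, hit count) pairs, best match first.

    Hypothetical helper: scores each Markdown page by how often the
    query terms appear in its text and in its file name.
    """
    terms = [t.lower() for t in re.findall(r"\w+", query)]
    scored = []
    for page in sorted(Path(wiki_dir).rglob("*.md")):
        text = page.read_text(encoding="utf-8", errors="ignore").lower()
        title = page.stem.lower()
        # Simple substring counts: body hits plus title hits per term.
        hits = sum(text.count(t) + title.count(t) for t in terms)
        if hits:
            scored.append((page.name, hits))
    # Highest hit count first; ties fall back to file name order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]
```

Calling `rank_pages("path/to/wiki/pages", "deploy port")` would return the few pages worth reading into context, and only those get loaded. A real setup would probably want better scoring than substring counts, but the shape (cheap search first, read a handful of pages second) is the same.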

Originally posted by u/m3m3o on r/ClaudeCode