https://ultrathink.art/blog/agent-cerebro-persistent-memory Running multiple Claude Code agents, the same problem kept surfacing: every session starts fresh, agents re-discover the same issues, CLAUDE.md bloats trying to capture everything. The architecture I landed on: short-term role-specific markdown files (fast, in-context, max 80 lines) + long-term SQLite with OpenAI embeddings for unbounded storage. The key piece is cosine similarity dedup at >0.92 threshold — agents won’t store near-duplicate entries of things they already know. Wrote it up with the design tradeoffs and what belongs in each tier: Curious if others have hit this problem and what you’ve tried. submitted by /u/ultrathink-art
Originally posted by u/ultrathink-art on r/ClaudeCode
