You can now give Hermes Agent infinite memory.

The three-tier architecture is the cleanest I’ve seen in any open-source agent.

The Tier 1 cap is the constraint. MEMORY.md is capped at 2,200 chars. USER.md is capped at 1,375 chars. Hit 80% of a cap and consolidation kicks in: the agent merges related entries into denser versions, which is lossy. The longer you run Hermes, the more your earlier context gets compressed away.

Tier 2 (SQLite FTS) has unlimited capacity, but every retrieval needs an LLM summarization pass. That puts tokens and latency on the critical path.

Tier 3 is the plug-in slot. That’s where agentmemory fits.

What it adds on top of the existing design:

→ Hybrid retrieval: BM25 + vector + knowledge graph, fused with RRF
→ Ebbinghaus decay so unused memories fade gracefully instead of getting consolidated out
→ Token-budgeted injection that keeps Tier 1 clean
→ Benchmarked on LongMemEval → 90% savings

Same numbers as the Claude Code benchmarks: ~92% fewer tokens at 240 observations. 200x more tool calls before hitting context limits.

Hermes already exposes the slot. agentmemory is the obvious thing to plug in.

https://github.com/rohitg00/agentmemory
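For anyone unfamiliar with the RRF step in the hybrid retrieval bullet: Reciprocal Rank Fusion merges several ranked lists by summing 1 / (k + rank) for each item across the lists. Below is a minimal sketch of that general technique, not agentmemory's actual code. The function name `rrf_fuse`, the memory ids, and the constant k = 60 (a commonly used default) are all assumptions for illustration.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common default; an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from three retrievers over the same memory store.
bm25_hits = ["m1", "m2", "m3"]
vector_hits = ["m2", "m1", "m4"]
graph_hits = ["m2", "m5", "m1"]

fused = rrf_fuse([bm25_hits, vector_hits, graph_hits])
print(fused)
```

RRF only looks at ranks, so the BM25, vector, and graph scores never need to be normalized onto a common scale, which is probably why it is a popular choice for this kind of fusion.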
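On the Ebbinghaus decay bullet: the classic forgetting curve is usually written as R = exp(-t / S), where t is time since the last access and S is a stability term. Here is a rough sketch of how that could down-weight stale memories at retrieval time. The post does not say how agentmemory parameterizes this, so the function names, the 72-hour stability default, and the idea of multiplying relevance by retention are assumptions.

```python
import math


def retention(hours_since_access, stability_hours):
    """Ebbinghaus-style retention: 1.0 when just accessed, decaying toward 0."""
    return math.exp(-hours_since_access / stability_hours)


def decayed_score(relevance, hours_since_access, stability_hours=72.0):
    """Scale a retrieval relevance score by retention (hypothetical scheme)."""
    return relevance * retention(hours_since_access, stability_hours)


# Two equally relevant memories: one touched a day ago, one six days ago.
recent = decayed_score(1.0, hours_since_access=24)
stale = decayed_score(1.0, hours_since_access=144)
print(recent > stale)
```

The point of a scheme like this is that an old memory is ranked lower but still exists, in contrast to Tier 1 consolidation, which rewrites entries.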
Originally posted by u/SeveralSeat2176 on r/ArtificialInteligence
