I’ve been running Claude as a dedicated development partner for the past 3 weeks — not just chat, but a persistent agent with its own memory files, personality config, and project context running 24/7 on a Mac mini. Here’s what surprised me:
- Raw logs beat curated summaries I ran a controlled experiment: same agent, same 20 questions, 4 different memory configurations. The agent with messy, raw daily logs (4.55/5) significantly outperformed the one with carefully written documentation (2.65/5). The clean summary actually scored below having no memory at all (3.30/5). Why? Curated summaries strip out uncertainty. The agent becomes overconfident — it “knows” things without knowing the messy context behind them. Raw logs preserve the debugging sessions, the wrong turns, the “we tried X and it failed” moments that make reasoning honest.
AGENTS.md / CLAUDE.md structure matters more than you think I started with a basic CLAUDE.md . Over 3 weeks it evolved into a multi-file system: SOUL.md (personality), MEMORY.md (long-term), USER.md (preferences), daily memory logs. The agent’s output quality improved noticeably as the context structure matured — not because of more data, but better organized data. 3. Compaction is sleep, not death When context gets too long, the conversation compacts. I used to think of this as losing context. Now I think of it as sleep — the agent “wakes up” and reconstructs itself from memory files. If your memory structure is good, compaction barely hurts. If it’s bad, every compaction is brain damage. 4. Persona isn’t fluff Setting a direct, no-nonsense personality (“report what you did, not what you plan to do”) made the agent dramatically more useful. Less hedging, less asking permission, more autonomous work. The persona file is arguably the highest-ROI config you can write. I wrote up the experiment in detail with methodology and data: blog post | paper | dataset If anyone’s interested in the multi-file memory structure, I’ve been working on an open spec for agent persona packages: Soul Spec Curious if others have found similar patterns with long-running agents. Disclosure: I’m the creator of Soul Spec and ClawSouls, both free and open source. The experiment data and paper are publicly available. submitted by /u/tomleelive
Originally posted by u/tomleelive on r/ClaudeCode
