Original Reddit post

Everyone’s posting about the leak. I spent the night reading the code and building things from it instead of writing about the drama. Here’s what I found useful, what I skipped, and what surprised me.

The stuff that matters:

CLAUDE.md gets reinserted on every turn change. Not loaded once at the start. Every time the model finishes and you send a new message, your CLAUDE.md instructions get injected again right where your message is. This is why well-structured CLAUDE.md files have such outsized impact. Your instructions aren’t a one-time primer. They’re reinforced throughout the conversation.

Skeptical memory. The agent treats its own memory as a hint, not a fact. Before acting on something it remembers, it verifies against the actual codebase. If you’re using CLAUDE.md files, this is worth copying: tell your agent to verify before acting on recalled information.

Sub-agents share prompt cache. When Claude Code spawns worker agents, they share the same context prefix and only branch at the task-specific instruction. That’s how multi-agent coordination doesn’t cost 5x the input tokens. Still expensive, probably why Coordinator Mode isn’t shipped yet.

Five compaction strategies. When context fills up, there are five different approaches to compressing it. If you’ve hit the moment where Claude Code compacts and loses track of what it was doing, that’s still an unsolved problem internally too.

14 cache-break vectors tracked. Mode toggles, model changes, context modifications, each one can invalidate your prompt cache. If you switch models mid-session or toggle plan mode in and out, you’re paying full token price for stuff that could have been cached.

The stuff that surprised me:

Claude Code ranks 39th on terminal bench. Dead last for Opus among harnesses. Cursor’s harness gets the same Opus model from 77% to 93%. Claude Code: flat 77%. The harness adds nothing to performance.
Even funnier: the leaked source references Open Code (the OSS project Anthropic sent a cease-and-desist to) to match its scrolling behavior. The closed-source tool was copying from the open-source one.

What I actually built from it (that night):

  • Blocking budget for proactive messages (inspired by KAIROS’s 15-second limit)
  • Semantic memory merging using a local LLM (inspired by autoDream)
  • Frustration detection via 21 regex patterns instead of LLM calls (5ms per check)
  • Prompt cache hit rate monitor
  • Adversarial verification as a separate agent phase

Total: ~4 hours. The patterns are good. The harness code is not.

Full writeup with architecture details: https://thoughts.jock.pl/p/claude-code-source-leak-what-to-learn-ai-agents-2026
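For anyone who wants to try the regex-based frustration detection from the list above, here is a minimal sketch of the idea. The post only says there are 21 patterns and doesn't list them, so every pattern below is a made-up illustration, and the names `FRUSTRATION_PATTERNS`, `frustration_score`, and `is_frustrated` are hypothetical, not taken from the leaked code.

```python
import re

# Hypothetical patterns for illustration only. The original post mentions
# 21 patterns but does not list them, so these are assumptions.
FRUSTRATION_PATTERNS = [
    r"\bwtf\b",
    r"\bthis (still )?(doesn'?t|does not|isn'?t|is not) work(ing)?\b",
    r"\bi (already|just) (told|said|asked)\b",
    r"\bstop (doing|changing|adding)\b",
    r"\bwhy (did|do|are|would) you\b",
    r"\b(again|still)\s*[?!]{2,}",
    r"!{3,}",
]

# Compile once at import time so each check is only a handful of regex searches.
_COMPILED = [re.compile(p, re.IGNORECASE) for p in FRUSTRATION_PATTERNS]


def frustration_score(message: str) -> int:
    """Return how many distinct patterns match the message."""
    return sum(1 for rx in _COMPILED if rx.search(message))


def is_frustrated(message: str, threshold: int = 1) -> bool:
    """Flag the message once at least `threshold` patterns match."""
    return frustration_score(message) >= threshold
```

Calling `is_frustrated(user_message)` before each turn lets the agent change tone or stop and ask a question without spending an LLM call on the check. Raising `threshold` trades recall for fewer false positives.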
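The prompt cache hit rate monitor from the list above can be sketched roughly like this. It assumes you have access to per-response usage counts shaped like the `usage` object of Anthropic's Messages API (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`). The `CacheMonitor` class itself is a hypothetical helper and not something from the leaked source.

```python
from dataclasses import dataclass


@dataclass
class CacheMonitor:
    """Hypothetical helper that accumulates token usage across turns."""

    read: int = 0      # input tokens served from the prompt cache
    written: int = 0   # input tokens written to the cache on this request
    uncached: int = 0  # input tokens that were neither read from nor written to cache

    def record(self, usage: dict) -> None:
        # Missing or None fields are treated as zero.
        self.read += usage.get("cache_read_input_tokens") or 0
        self.written += usage.get("cache_creation_input_tokens") or 0
        self.uncached += usage.get("input_tokens") or 0

    @property
    def hit_rate(self) -> float:
        """Fraction of all input tokens so far that came from the cache."""
        total = self.read + self.written + self.uncached
        return self.read / total if total else 0.0


monitor = CacheMonitor()
# First turn: the shared prefix is written to the cache, nothing is read yet.
monitor.record({"input_tokens": 100, "cache_creation_input_tokens": 900,
                "cache_read_input_tokens": 0})
# Second turn: the same prefix is read back from the cache.
monitor.record({"input_tokens": 100, "cache_creation_input_tokens": 0,
                "cache_read_input_tokens": 900})
print(f"hit rate: {monitor.hit_rate:.2f}")
```

A sudden drop in `hit_rate` right after a model switch or a mode toggle is the kind of cache break described in the post, which makes this number a cheap signal to log per session.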

Originally posted by u/Joozio on r/ClaudeCode