Two numbers from Anthropic’s prompt caching docs that explain most of your token bill: That’s the math: cache miss = 12.5× more expensive than cache hit for the same prefix. On a 50,000-token Claude Code session prefix (system + tools + CLAUDE.md
- early turns), the difference per turn is real money — and most users bust their cache without noticing. Anthropic publishes the exact invalidation table . Cache is built in this order: tools → system → messages . Changes at any level invalidate that level and everything after it . So not all cache busts are equal — some flush only the recent messages, others flush the entire prefix back to your tool definitions. Here are the 5 actions in Claude Code that trigger this, ordered from “nukes everything” to “trims the tail”:
- Install or remove an MCP server mid-session — busts everything Anthropic: “Modifying tool definitions (names, descriptions, parameters) invalidates the entire cache.” MCP servers register tool definitions. Adding claude mcp add or running /mcp during an active session changes the tools block at the top of every cached request. Everything downstream — system, CLAUDE.md , full conversation — gets re-written at 1.25× cost. Fix: install all your MCPs at session start. If you need a new one mid-task, finish the current task, /clear , then add.
- Switch model with /model — cache namespace changes entirely Caches are per-model. Switching from Sonnet to Opus mid-session doesn’t migrate the cache; the prefix is processed fresh on the next turn. There’s no warning in the UI. Fix: pick the model at session start. Use Opus for planning, Sonnet for execution — but split them into separate sessions, not one session you keep flipping.
- Edit CLAUDE.md while a session is open — busts system + messages CLAUDE.md content is delivered as part of the system prompt area. Anthropic’s invalidation rule: any system-level change invalidates the system cache and everything in the messages cache that built on it. Edit a single line in CLAUDE.md, save, send the next message → prefix below your CLAUDE.md gets re-written. Fix: edit CLAUDE.md between sessions, not during one. If you must edit mid-session, /clear first so you don’t pay to re-write a long conversation.
- Toggle fast mode (Shift+Tab) — busts system + messages Anthropic lists “speed setting” as a system-cache invalidator: “Switching between speed: ‘fast’ and standard speed invalidates system and message caches.” Every Shift+Tab toggle re-writes the cached prefix. Fix: pick one speed at session start and stay there. If you toggle 3 times across a session, you’ve paid the cache-write premium 3 times.
- Paste an image mid-conversation — busts messages only The lightest of the five. Per the invalidation table: “Adding/removing images anywhere in the prompt affects message blocks.” Tools and system stay cached, but the entire messages prefix is processed fresh. Fix: this one is often worth it (screenshots are high-signal). Just know that “let me drop a quick screenshot” isn’t free — you’re paying ~10% of your input bill to add it. The general rule Anthropic’s exact phrasing: “Cache hits require 100% identical prompt segments, including all text and images up to and including the block marked with cache control.” 100% identical. Not “mostly the same.” One character changes in your CLAUDE.md , you pay 12.5× to process the next turn. This is why every Anthropic doc tells you to lock your configuration at session start. Sources Prompt caching — Anthropic API docs (every quoted number is from this page) How Claude remembers your project — Anthropic Claude Code docs Best practices for Claude Code — Anthropic submitted by /u/lawnguyen123
Originally posted by u/lawnguyen123 on r/ClaudeCode
You must log in or # to comment.
