I’ve finally been hitting caps on a new workflow that’s very agent-heavy. For the last two weeks, any time I spun up many agents I burned through far more usage, and I thought “well, that’s just what happens when you use agents”. I was wrong. I looked into the CC cache and how it works (well, Claude did), and unfortunately it’s actually broken right now. Ask yours to look into this and fix it for you too. It’s a config setting, nothing fancy.

In early March 2026, Unsloth published a finding that Claude Code prepends a changing attribution header (session ID, turn counter, or timestamp) to every message it sends to the model, which invalidates the prefix cache on every single turn.

The fix is setting CLAUDE_CODE_ATTRIBUTION_HEADER to “0” in ~/.claude/settings.json under env. Exporting it as an environment variable doesn’t work; it has to be in the settings file to survive across shells.

I fixed mine a few days ago, and I’ve noticed a difference. I’m sure the recent rate cap increases are part of it too, so I have no idea how much to attribute to which change. Either way, the workflow is no longer in danger of capping out.

If you want to go deeper, I’d recommend claude-code-cache-fix. It’s a bit more tooling and you’re introducing a filter to your API traffic, but it appears to be compounding the token savings. I’m not affiliated, it’s just working out for me and I’m happy with it.
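For anyone who would rather script the settings change than edit the file by hand, here is a minimal sketch in Python. The variable name, the value “0”, and the `env` key in ~/.claude/settings.json are taken from the description above. The helper `disable_attribution_header` is a hypothetical name made up for illustration, and the sketch assumes the settings file is plain JSON with an object at the top level. It keeps whatever else is already in the file.

```python
import json
from pathlib import Path

# Variable name as given in the post; the helper below is illustrative only.
ENV_KEY = "CLAUDE_CODE_ATTRIBUTION_HEADER"


def disable_attribution_header(settings_path: Path) -> dict:
    """Set env.CLAUDE_CODE_ATTRIBUTION_HEADER to "0" and keep all other settings."""
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        if text.strip():
            # Assumption: the file holds a single JSON object.
            settings = json.loads(text)

    # Create the "env" section if it is missing, then set the flag as a string.
    env = settings.setdefault("env", {})
    env[ENV_KEY] = "0"

    # Write the merged settings back, creating the directory if needed.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `disable_attribution_header(Path.home() / ".claude" / "settings.json")` would apply it to the file mentioned above. It’s probably worth copying the file somewhere first, since the sketch overwrites it in place.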
Originally posted by u/ZioniteSoldier on r/ClaudeCode
