Original Reddit post

**TL;DR:** Before you type a single word, Claude Code v2.1.84 consumes 16,063 tokens of hidden overhead in an empty directory, and 23,000 tokens in a real project. Built-in tools alone account for ~10,000 tokens. This is likely why your usage "fills up faster": the context window didn't shrink, the startup prompt grew.

**The Investigation**

I kept seeing posts about context filling up faster, usage bars jumping to 50% after one message, etc. Instead of complaining, I decided to actually measure it.

**Environment:**

- Claude Code v2.1.84 (latest as of March 25, 2026)
- Model: claude-opus-4-6[1m]
- macOS, /opt/homebrew/bin/claude
- Method: `claude -p --output-format json --no-session-persistence 'hello'`

**Results: What's Eating Your Tokens**

From the debug logs on a fresh session (even in an empty directory):

- 12 plugins loaded
- 14 skills attached
- 45 official MCP URLs catalogued
- 4 hooks registered
- Dynamic tool loading initialized

In a real project, add:

- CLAUDE.md files (project instructions)
- .mcp.json (MCP server configs)
- AGENTS.md, hooks, memory files, settings

The point: the user-visible prompt is NOT the real prompt budget. Your "hello" arrives with 16-23K tokens of entourage already in the room.

**The Other Problem: Context ≠ Usage**

Many people are actually conflating two different things:

- **Context limit** = how much fits in the conversation window (still 1M for Max+Opus users)
- **Usage limit** = your 5-hour / 7-day API quota

They feel the same when you hit them. They're not the same system. Anthropic fixed bugs where one was displayed as the other in v2.1.76 and v2.1.78, but the confusion persists.

GitHub issues confirming real bugs in this area:

- #28927: 1M context consuming extra usage after auto-update
- #29330: opus[1m] hitting rate limits while standard 200K worked
- #36951: UI showed near-zero usage, backend required extra usage
- #39117: Context accounting inconsistency between UI and /context

**What You Can Do Right Now**

- `--bare`: skips plugins, hooks, LSP, memory, MCP. Maximum lean mode.
- `--tools=''`: disabling built-in tools saves ~10,000 tokens immediately.
- `--strict-mcp-config`: ignores external MCP configurations.
- **Keep CLAUDE.md lean**: every byte of project instructions is injected into every prompt.
- **Know the difference**: `/context` shows context state. The status bar shows quota. They measure different things.

**What Anthropic Should Do**

- Expose a first-class token breakdown at startup: system prompt, tools, MCP schemas, project instructions, hooks. If I can see what's consuming my budget, I can manage it.
- Keep context and quota reporting visually distinct in the UI.
- Continue capping/deferring MCP and tool schemas (v2.1.81-84 already started this).

**Root Cause Summary**

The March 2026 "fills up faster" experience is real, but it's NOT a simple context window reduction. The evidence points to:

1. Hidden prompt overhead grew: more tools, skills, plugins, hooks, MCP
2. The 1M context rollout plus extra-usage policies created quota confusion
3. Real bugs in context accounting and compaction (mostly fixed in v2.1.76-84)

Anthropic didn't silently shrink your context window. The context window got dressed up with a lot more overhead, and the quota system got confusing. Both are being actively addressed.

**Methodology**

All measurements were taken with `claude -p --output-format json --no-session-persistence 'hello'`. Token counts come from the API response metadata (`cache_creation_input_tokens` + `cache_read_input_tokens`). Debug logs were captured via `--debug`. Release notes were cross-referenced against the official changelog.
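The methodology above can be sketched as a small script. To be clear about what's assumed: the CLI flags and the two token field names are taken from the post itself, but the exact nesting of those fields under a top-level `usage` key in the JSON output, and the `startup_overhead`/`measure` helper names, are assumptions for illustration, not the documented schema.

```python
import json
import subprocess


def startup_overhead(result: dict) -> int:
    """Sum the cached input tokens that ride along before your prompt.

    Field names come from the Methodology section above; their nesting
    under a top-level "usage" key is an assumption about the JSON shape.
    """
    usage = result.get("usage", {})
    return (usage.get("cache_creation_input_tokens", 0)
            + usage.get("cache_read_input_tokens", 0))


def measure(extra_flags=()):
    """Run the one-word probe from the post and return its hidden overhead.

    Requires the claude CLI on PATH, so it is defined but not called here.
    """
    cmd = ["claude", "-p", "--output-format", "json",
           "--no-session-persistence", *extra_flags, "hello"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return startup_overhead(json.loads(out.stdout))
```

Comparing `measure()` against `measure(["--bare"])` in the same directory would put a concrete number on how much the lean mode actually saves at startup.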
Update: v2.1.84 added `--bare` mode, capped MCP tool descriptions at 2KB, and improved rate-limit warnings, which suggests Anthropic is aware of and actively working on these issues.

Originally posted by u/wirelesshealth on r/ClaudeCode