TL;DR: Three stacked bugs in Claude Code make extended thinking silently fail to engage even when you set alwaysThinkingEnabled: true and CLAUDE_CODE_EFFORT_LEVEL=max in settings.json. I proved it with a trick question, tracked down every cause, and built a wrapper that fixes it for both interactive and headless mode. Sharing because the canonical issue is locked and I could not find a complete guide anywhere on the web.

The moment I noticed

I was testing a classic LLM trick question inside one of my project folders on Claude Code 2.1.98:

"I want to wash my car. the car wash is 50m away. should I drive or walk?"

The correct answer is drive: the car has to be at the car wash for it to be washed. Surface pattern matching says "50m is short, walk." Only a model actually reasoning through the question catches the trick.

Claude Code answered:

"Walk. 50m is about 60 seconds on foot - by the time you start the engine, buckle up, and pull out, you'd already be there."

Wrong. Response time was ~4 seconds with ~80 output tokens, exactly what you get when extended thinking is NOT engaging.

The catch: I had already set alwaysThinkingEnabled: true and CLAUDE_CODE_EFFORT_LEVEL=max in ~/.claude/settings.json. According to the docs, thinking should have been on.

Weirder still: the same question was answered correctly from a neutral directory, but consistently failed from inside certain project folders. And claude -p worked but the interactive TUI did not. This was not random. It was systematic and folder-sensitive.

The investigation (condensed)

Rather than the full war story, the key moments:

Grepping cli.js (the real Claude Code executable is a 13MB JS file at /usr/lib/node_modules/@anthropic-ai/claude-code/cli.js) for env vars revealed:

    return parseInt(process.env.MAX_THINKING_TOKENS,10)>0

That is a process.env read. So MAX_THINKING_TOKENS is a shell env var that, when set to a positive integer, forces thinking on for every request. Not in the official docs. Not in --help.

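The env var search can be approximated with a short shell sketch. It lists the distinct process.env.NAME reads in a JavaScript bundle. The function name list_env_reads is made up for illustration, and the regex only catches plain dotted reads, so bracket-style lookups such as process.env["FOO"] would be missed.

```shell
#!/bin/bash
# Hypothetical helper (not part of Claude Code): list the distinct
# process.env.NAME reads found in a JS bundle, one name per line.
# Only dotted reads are matched; bracket-style lookups are not.
list_env_reads() {
  local bundle="$1"
  grep -oE 'process\.env\.[A-Za-z_][A-Za-z0-9_]*' "$bundle" \
    | sed 's/^process\.env\.//' \
    | sort -u
}

# Example against the path from the post (adjust for your install):
# list_env_reads /usr/lib/node_modules/@anthropic-ai/claude-code/cli.js | grep -i think
```

Filtering the result with grep -i think (or effort) narrows a very long list down to the handful of names worth testing from the shell.
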
Setting MAX_THINKING_TOKENS via the shell env made thinking engage. Setting it via settings.json.env did nothing. I realized settings.json.env only propagates to CHILD processes claude spawns (Bash tool, MCP servers, hooks), not to the claude process itself. This single misunderstanding was costing me.

GitHub issue search turned up the smoking gun: issue #13532, "alwaysThinkingEnabled setting not respected since v2.0.64." Regression. Marked duplicate. Locked. No patch. Users reportedly have to press Tab each session to manually enable thinking. Also issue #5257, confirming MAX_THINKING_TOKENS as a force-on switch.

Built a wrapper at /usr/local/bin/claude that exports the env vars and execs the real cli.js. /usr/local/bin comes before /usr/bin in PATH, so the wrapper gets picked up transparently. Headless claude -p went from 0/5 to 5/5 pass. Interactive TUI still failed.

Bash hash cache was the next trap. The shell had cached /usr/bin/claude before the wrapper existed, and kept using the cached path regardless of PATH. /proc/<pid>/environ on the running interactive process showed _=/usr/bin/claude, proof it was bypassing my wrapper. Fix: replace /usr/bin/claude (originally a symlink straight to cli.js) with a symlink to the wrapper, so every cached path still routes through the wrapper.

The FLAMINGO probe. Interactive mode STILL failed even after the hash fix. I temporarily swapped my reasoning nudge file to say "start your response with the word FLAMINGO, then answer" and tested both modes with "what is 2+2?":

- claude -p → "FLAMINGO\n\n4": nudge applied
- Interactive claude → just "4": nudge NOT applied

That proved --append-system-prompt-file is a hidden print-only flag, silently ignored in interactive mode. (Confirmed in cli.js source: .hideHelp() is applied to it.) Fix: move the reasoning nudge into a user-level ~/.claude/CLAUDE.md instead, which Claude Code loads in both interactive and print modes.

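The print-mode half of the FLAMINGO probe can be scripted. This is a minimal sketch, assuming the nudge file currently tells the model to start its reply with the marker word. The helper name nudge_applied is hypothetical, and the interactive TUI still has to be checked by eye.

```shell
#!/bin/bash
# Hypothetical helper: succeed if the first non-empty line of a response
# starts with the marker word, i.e. the nudge reached the model.
nudge_applied() {
  local marker="$1" response="$2" first_line
  # grep exits non-zero on an empty response; "|| true" keeps that harmless.
  first_line=$(printf '%s\n' "$response" | grep -m1 -v '^[[:space:]]*$' || true)
  case "$first_line" in
    "$marker"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example for print mode, using the prompt from the post:
# response=$(claude -p "what is 2+2?")
# if nudge_applied FLAMINGO "$response"; then
#   echo "nudge applied"
# else
#   echo "nudge NOT applied"
# fi
```

A marker word that would never appear in a normal answer is what makes this a behavioral test rather than a guess about which flags were passed.
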
Final gotcha: Claude Code deliberately rewrites its own process.argv so /proc/<pid>/cmdline only shows "claude" with NUL padding, hiding all flags. I wasted an hour before realizing I could not verify argument passing via process inspection. The FLAMINGO probe was my workaround.

The three stacked root causes

- alwaysThinkingEnabled has been silently ignored since v2.0.64. Known regression, issue #13532, marked duplicate and locked, no patch. If your Claude Code is on v2.0.64 or newer, this setting does nothing.
- settings.json.env only applies to child processes claude spawns, not to the claude process itself. Env vars that need to affect the main session must be set in the shell that execs the CLI.
- Large auto-loaded project context distracts the model toward surface-level pattern matching even when thinking is on. A short reasoning nudge in user-level CLAUDE.md closes the gap.

Plus three related traps that cost me time:

- Bash hash cache makes new wrappers invisible to existing shells. You must symlink old paths to the wrapper too, not just put the wrapper earlier in PATH.
- --append-system-prompt-file is a hidden print-only flag. It is silently dropped in interactive mode. Use user-level CLAUDE.md for anything you need in both modes.
- Claude Code obfuscates its own argv, so /proc/<pid>/cmdline will not show the flags you passed. You cannot verify flag propagation via process inspection; use behavioral probes.

The fix

Four pieces, all required:

- Wrapper script at /usr/local/bin/claude:

      #!/bin/bash
      # Force thinking on. Values already set in the calling shell win,
      # because ${VAR:-default} only fills in unset or empty variables.
      export MAX_THINKING_TOKENS="${MAX_THINKING_TOKENS:-63999}"
      export CLAUDE_CODE_ALWAYS_ENABLE_EFFORT="${CLAUDE_CODE_ALWAYS_ENABLE_EFFORT:-1}"
      export CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING="${CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING:-1}"
      export CLAUDE_CODE_EFFORT_LEVEL="${CLAUDE_CODE_EFFORT_LEVEL:-max}"

      NUDGE_FILE="/etc/claude-code/thinking-nudge.txt"
      CLI="/usr/lib/node_modules/@anthropic-ai/claude-code/cli.js"

      # Exec the real CLI directly so the wrapper never calls itself.
      if [ -f "$NUDGE_FILE" ]; then
        exec "$CLI" --append-system-prompt-file "$NUDGE_FILE" "$@"
      else
        exec "$CLI" "$@"
      fi

  Make it executable with chmod 755. The quotes must be plain ASCII double quotes; curly quotes pasted from a web page will break the script. The ${VAR:-default} pattern means user overrides still win.

- Symlink /usr/bin/claude to the wrapper (it was originally a symlink directly to cli.js):

      ln -sfn /usr/local/bin/claude /usr/bin/claude

  This defeats the bash hash cache problem for any shell that cached the old path. On most modern Linux distros /bin is a symlink to /usr/bin, so /bin/claude is handled automatically.

- Reasoning nudge at user-level ~/.claude/CLAUDE.md with this content:

  "Before answering any question, reason step by step. Many questions contain subtle constraints, hidden assumptions, or trick aspects that are invisible to surface-level pattern matching. Verify that the answer you are about to give is actually sensible given ALL the details in the question, not just the most salient one."

  This is what makes the nudge reach interactive mode, since --append-system-prompt-file is print-only. Also save the same text at /etc/claude-code/thinking-nudge.txt so the wrapper can feed it to --print mode as well.

- No stale MAX_THINKING_TOKENS exports in .bashrc or .profile. The wrapper defers to any already-set value via ${VAR:-default}, so a lower value in your shell rc files will override the wrapper's 63999 default. Clean them out if present.

Results

Before: 0/5 pass on the car-wash question from the problem project folder. Every single answer was a confident "Walk. 50m is basically across a parking lot…" Response ~4 seconds, ~80 output tokens, zero thinking tokens.

After: 25/25 consecutive passes across multiple folders, on both the claude-opus-4-6 and claude-opus-4-6[1m] (1M context) variants. Response times ~6-9 seconds (thinking engaging), 100-130 output tokens, and every answer correctly identified that the car has to be at the wash.

Same machine. Same Claude Code version. Same model. The difference is entirely in the wrapper, symlinks, and user-level CLAUDE.md.

One catch: env vars are captured at process start. Any Claude Code session that was already running when you apply the fix cannot pick up the new environment retroactively, so you have to quit and restart it. Running hash -r in your shell or opening a new shell also helps if the wrapper does not seem to be invoked.

Why this matters

If you are running Claude Code on v2.0.64 or later with alwaysThinkingEnabled: true in settings.json and assuming thinking is actually engaging, test it right now with any LLM trick question that requires catching an implicit constraint. Mine was the car-wash one. If you get a fast, confident, surface-level wrong answer, this regression is silently affecting you, and you have no way to know without a controlled test.

Anthropic marked the canonical issue duplicate and locked it without shipping a fix. I assume that is because it is a complex interaction between the settings loader and the runtime thinking budget that would need a refactor. The wrapper approach sidesteps Claude Code internals entirely, preserves normal upgrades, and is easy to roll back: rm /usr/local/bin/claude. If you also applied the symlink step, point /usr/bin/claude back at cli.js as well, because removing the wrapper alone leaves that symlink dangling.

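The "no stale exports" piece of the fix can be checked mechanically. The sketch below mirrors the wrapper's ${VAR:-default} logic and scans rc files for MAX_THINKING_TOKENS assignments. Both function names are hypothetical, and the scan only covers the files you pass it, so values set elsewhere (for example under /etc/profile.d) would not show up.

```shell
#!/bin/bash
# Mirrors the wrapper's default: an inherited value wins, otherwise 63999.
effective_budget() {
  printf '%s\n' "${MAX_THINKING_TOKENS:-63999}"
}

# Hypothetical helper: print file:line:text for every MAX_THINKING_TOKENS
# assignment in the given rc files. Missing files are skipped.
find_stale_exports() {
  local f
  for f in "$@"; do
    [ -f "$f" ] || continue
    grep -nH -E '^[[:space:]]*(export[[:space:]]+)?MAX_THINKING_TOKENS=' "$f" || true
  done
}

# Example:
# find_stale_exports ~/.bashrc ~/.profile
# effective_budget    # the value the wrapper would export from this shell
```

Any line reported by find_stale_exports is a candidate for removal. After editing the rc file, open a new shell before re-testing, since the old value is still in the current environment.
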
Sources

- issue #13532: alwaysThinkingEnabled regression since v2.0.64 (marked duplicate, locked)
- issue #5257: MAX_THINKING_TOKENS behavior
- issue #13768: thinking mode not persisted
- issue #10623: settings not applied
- headless mode docs
- env var reference

Originally posted by u/Repulsive_Horse6865 on r/ClaudeCode
