Original Reddit post

I was running some experiments with the output config effort level setting in the Claude Messages API together with prompt caching, and discovered something strange. When you change the effort level in a multi-turn conversation, the new request can only read cache entries written by an earlier request that used the same effort level. This applies to both the system prompt cache and the messages-level cache.

For example (CB = cache breakpoint):

Turn 1: effort high. Passed: system prompt (CB) + turn 1 user message (CB) => both CBs are written to the cache.

Turn 2: effort low. Passed: system prompt (CB) + turn 1 user (CB) + turn 1 assistant + turn 2 user (CB) => the system prompt and messages array are written to the cache again (no cache read).

Turn 3: effort high. Passed: system prompt (CB) + turn 1 user (CB) + turn 1 assistant + turn 2 user (CB) + turn 2 assistant + turn 3 user (CB) => the first 2 CBs that were written in turn 1 are read, and the rest is rewritten to the cache.

I tried looking in the documentation to check whether this behaviour is expected or some kind of bug, and I couldn't find anything. Does anyone here know whether this is expected behaviour? Should I raise an issue with Anthropic about this?

For reference: all 3 turns used Sonnet 4.6 with adaptive thinking, the same system prompt and the same max tokens, and no tools.
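One way to make the pattern above concrete is a toy model in which the effort level is part of the cache key. The Python sketch below is only that: a hypothetical model that reproduces the three turns described in the post. It is not Anthropic's implementation, and `ToyPrefixCache`, `request`, and the breakpoint labels are made-up names for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPrefixCache:
    """Hypothetical cache: each entry is keyed by (effort, prefix up to a breakpoint)."""

    entries: set = field(default_factory=set)

    def request(self, effort, breakpoints):
        """Simulate one request.

        `breakpoints` lists the labels of the cache breakpoints in prompt order.
        Returns (read, written): which breakpoints hit the cache and which
        had to be written.
        """
        read, written = [], []
        for i, label in enumerate(breakpoints):
            # Assumption: the effort level is part of the key, so the same
            # prefix sent with a different effort never matches.
            key = (effort, tuple(breakpoints[: i + 1]))
            if key in self.entries:
                read.append(label)
            else:
                self.entries.add(key)
                written.append(label)
        return read, written


if __name__ == "__main__":
    cache = ToyPrefixCache()
    turns = [
        ("high", ["system", "user1"]),
        ("low", ["system", "user1", "user2"]),
        ("high", ["system", "user1", "user2", "user3"]),
    ]
    for number, (effort, breakpoints) in enumerate(turns, start=1):
        read, written = cache.request(effort, breakpoints)
        print(f"Turn {number} ({effort}): read={read} written={written}")
```

If the real cache behaved like this model, turn 2 would read nothing and turn 3 would read only the two breakpoints written in turn 1, which matches the observation. Whether effort is actually part of the cache key is exactly the open question.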

Originally posted by u/MediumChemical4292 on r/ClaudeCode