Original Reddit post

Using Opus 4.8. It was fine a couple of days ago. Recently, though, I've noticed that the hidden thinking token count (which signals that the model is actually thinking) shows up super late. I can wait a full minute and nothing happens. With a fresh context, the hidden thinking token count arrives almost immediately. This wasn't the case before: even with 300k or so of context in the 1M-context Opus 4.8, it was still quick after each round. It feels like a problem with cache hits. What is going on with Anthropic's engineering? Anyone else with the same issue, or is it just me 🥲

Originally posted by u/True_Independent4291 on r/ClaudeCode