I build vexp, so I spend all day watching how Claude Code behaves across different sessions. I've been on 4.8 for about a week and noticed something weird: some sessions are the best I’ve ever seen. Others are worse than 4.6. Same model, same project, same prompts.

Took me a few days to figure out the pattern. 4.8 follows instructions way more literally than older versions. If you give it good context with the right files and clear scope, it executes like a machine. But if the context is noisy, full of irrelevant files from the exploration phase, it takes that noise literally too. It doesn’t “figure out what you probably meant” anymore. It does exactly what the context suggests, even when the context is garbage.

The other thing killing sessions right now: auto-compaction is firing at around 50% context usage. There’s a confirmed issue on GitHub about it. You’re in the middle of a multi-step task, everything is going great, and suddenly “this session is being continued from a previous conversation that ran out of context.” Your working state is gone. The compaction summary loses half the nuance. And 4.8’s literal interpretation means it reads that lossy summary and takes it as gospel.

So the pattern is: 4.8 amplifies whatever context it gets. Great context = best Claude ever. Bad or compressed context = worst Claude ever. It’s not a model problem. It’s a context problem that the model makes more visible.

3 things that helped me land on the “amazing” side consistently:
- /compact manually at 40%, don’t wait for auto-compaction. The auto trigger is buggy right now and when it fires you lose context quality. Do it yourself earlier when you can control what gets kept.
- Stop letting Claude read everything. This mattered less on 4.6 because it would kind of ignore irrelevant stuff. 4.8 doesn’t ignore anything. If it reads a file, it uses it. Be explicit about scope in every prompt.
- This is the problem I built vexp to solve. It pre-filters what Claude sees based on the actual dependency graph of your project. Claude gets only the relevant files, pre-ranked, in one call instead of exploring 15-20 files blind. On 4.8 specifically this matters more than on any previous model because of how literally it interprets everything. Clean context in, precise code out. Noisy context in, precise garbage out. vexp.dev if you want to try it, free tier is 20 calls a day.

Anyone else noticing this pattern on 4.8? The sessions where it’s great are genuinely incredible but the bad ones feel worse than 4.6. Curious if context quality is the variable for you too.
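
To make the dependency-graph idea from the third tip concrete, here is a minimal sketch. It is not vexp's actual implementation: the function name `rank_relevant_files`, the hand-written `deps` graph, the file paths, and the `max_depth` cutoff are all assumptions for illustration. It walks outward from the file being edited and keeps only files within a couple of dependency hops, nearest first.

```python
from collections import deque


def rank_relevant_files(deps, start, max_depth=2):
    """Return files reachable from `start` within `max_depth` hops, nearest first.

    `deps` maps a file to the files it depends on (hypothetical input;
    a real tool would build this by parsing imports).
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distance[current] == max_depth:
            continue  # do not expand beyond the depth cutoff
        for neighbor in deps.get(current, []):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    # Sort by hop count, then by name so the ordering is deterministic.
    return sorted(distance, key=lambda f: (distance[f], f))


# Hypothetical project graph, written by hand for the example.
deps = {
    "api/orders.py": ["services/billing.py", "models/order.py"],
    "services/billing.py": ["models/invoice.py"],
    "models/order.py": [],
    "models/invoice.py": ["utils/money.py"],
    "scripts/old_migration.py": ["models/order.py"],
}

relevant = rank_relevant_files(deps, "api/orders.py")
```

In this toy graph, `scripts/old_migration.py` never shows up in `relevant` because nothing the target file depends on leads to it, which is the kind of exploration noise the post is about keeping out of context.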
Originally posted by u/Objective_Law2034 on r/ClaudeCode
