Original Reddit post

Please help. Idk why, but for the past week and a half Claude has been unstable, to say the least: Claude reads instructions, says it understands them, and then routinely doesn’t follow them. It redefines success despite being given extremely explicit, testable requirements, cuts around the hard parts, silently jumps around to the wrong worktrees, presents partial results as progress, and even tries to DELETE THE FAILING TESTS🤯. It’s gotten to the point where, in virtually every recent session, Claude will read the requirements, deliverables, and non-negotiables I give it, even quote them back, and then blatantly proceed to do the thing it was told not to do or skip the thing it was told to do. Things tried so far that failed:

  • Writing it in CLAUDE.md
  • Rewriting the entire CLAUDE.md
  • Reducing task scope
  • Using different effort levels (both Sonnet and Opus at all effort levels)
  • Auditing the codebase
  • Reorganizing the codebase to be agent-friendly
  • Starting projects over from scratch/refactoring (on my 4th or 5th at this point)
  • Using custom and popular specialization skills
  • Minimizing the initially loaded context
  • Using workflow plugins like gsd, superpowers, and compound engineering
  • Writing it in memory
  • Adding hooks that run automatically
  • Adding guardrail scripts
  • Adding checkpoint commands
  • Yelling at it
  • Documenting the violations in an append-only failure log (a markdown file) so it can see its own history

I have switched to Codex for execution. Very sad. I used to love Claude; I loved the skills and workflows. Now it’s pretty much unusable. It’s overconfident and too eager, which just makes it even more frustrating. If this keeps up I’m going to have to cancel my Max plan. People here say “oh you just have to wait it out, it means new updates are coming,” but updates happen multiple times per week, and I feel cheated when these “down periods” stretch for more than a week or two. That’s basically the same as paying $200 per 2 weeks instead of per month.

submitted by /u/Alarmed_Sky_41

Originally posted by u/Alarmed_Sky_41 on r/ClaudeCode