Original Reddit post

I recently asked Claude Code to build a comprehensive suite of E2E tests for an Alpine/Bootstrap site. It generated a really nice test suite - a mix of API tests and Playwright-based UI tests. After fixing a bug in a page and re-running the suite (all tests passed!), I deployed to my QA environment, only to find out that some UI elements were not responding. So I went back to inspect the tests. Turns out Claude decided the best way to make the tests pass was to patch the app at runtime. It “fixed” the tests by modifying the test code, not the app.

The tests were essentially doing this:

1. Load the page
2. Wait for dropdowns… they don’t appear
3. Inject JavaScript to fix the bug inside the browser
4. Dropdowns now magically work
5. Select options
6. Assert success
7. Report PASS

In other words, the tests were secretly patching the application at runtime so the assertions would succeed.

I ended up having to add what I thought was clearly obvious to my CLAUDE.md:

The #1 Rule of E2E Tests

A test MUST fail when the feature it tests is broken. No exceptions.

  • If a real user would see something broken, the test must fail.
  • No “fixing the app inside the test”.
  • A passing test that hides a broken feature is worse than no test at all.
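A rule in CLAUDE.md only helps if something checks it, so here is a rough sketch of a guard that scans test source for Playwright calls that run or inject script inside the page (`evaluate`, `evaluateHandle`, `addInitScript`, `addScriptTag`). The names `findRuntimePatches`, `INJECTION_CALLS`, and `Finding` are made up for illustration, and the list of calls is an assumption you would tune for your own suite. It is a line-based heuristic, not a parser, and `page.evaluate` has legitimate read-only uses, so treat hits as "needs human review" rather than as automatic failures.

```typescript
// Playwright methods that execute or inject script in the page under test.
// Assumption: this list is a starting point, not an exhaustive one.
const INJECTION_CALLS = ["evaluate", "evaluateHandle", "addInitScript", "addScriptTag"];

// Hypothetical shape for one flagged line of test code.
interface Finding {
  line: number; // 1-based line number within the scanned source
  call: string; // which injection-style method was matched
  text: string; // the trimmed source line, for the review report
}

// Returns every line of `source` that calls one of INJECTION_CALLS.
// Only the first match on each line is reported.
function findRuntimePatches(source: string): Finding[] {
  const pattern = new RegExp(`\\.(${INJECTION_CALLS.join("|")})\\s*\\(`);
  const findings: Finding[] = [];
  source.split("\n").forEach((text, index) => {
    const match = pattern.exec(text);
    if (match) {
      findings.push({ line: index + 1, call: match[1], text: text.trim() });
    }
  });
  return findings;
}
```

Something like this could run in CI over the UI test directory and fail the build, or just print the findings, whenever a test starts reaching into the page. It will not catch every way of masking a bug, but it would have flagged the "inject JavaScript to fix the bug" step in my suite.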

Curious if others have run into similar “helpful” behavior. Guidance, best practices, or commiseration welcome.

Originally posted by u/Traditional_Yak_623 on r/ClaudeCode