I got Claude to make a web app, then asked it to perform its own manual QA on the app using the Playwright MCP: taking screenshots, navigating around, etc. Claude ran more than 20 different QA scenarios and declared the app working as intended. It took me less than 10 seconds to find serious bugs in the app that would be obvious to any human viewer: UI issues, missing text, obviously inconsistent state, etc.

Has anyone had success getting Claude to actually find & fix bugs? I'm honestly shocked at how consistently bad it is at this kind of operation.
Originally posted by u/thurn2 on r/ClaudeCode
