The meta has moved from AI writing code you still had to debug to seeing how many agents you can spin up (and manage) to write even more code you still have to validate. AI can one-shot a lot of simple stuff, but 85% accuracy on a complicated app spread across dozens of modules still isn't good enough, as far as I'm concerned.

I built this for myself and use it almost every day. It's finally at a point where I'd like some feedback: https://github.com/briankelley/devils-advocate

From the repo README: Do you have an implementation plan, codebase, or spec created by Claude, GPT, Gemini, Grok, etc., and want the flagship models from competing frontier providers to rip it apart, exposing the holes in logic and potential coding landmines before a single line of code gets written?

Screenshots: Dashboard · Config · Review Details
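The core idea — hand one model's plan to a rival provider's model with instructions to attack it rather than improve it — can be sketched roughly like this. This is a minimal illustration of the concept, not the repo's actual code; the function name and prompt wording are my own assumptions.

```python
# Hypothetical sketch of the cross-provider "devil's advocate" flow:
# wrap a plan authored by one model in adversarial-review instructions
# intended for a competing provider's flagship model. The resulting
# prompt would then be sent via that provider's API (not shown here).

REVIEW_INSTRUCTIONS = (
    "You are reviewing an implementation plan written by {author}. "
    "Act as a devil's advocate: expose holes in the logic, risky "
    "assumptions, and potential coding landmines. Do not rewrite the "
    "plan; list concrete problems."
)

def build_review_prompt(plan: str, author_model: str) -> str:
    """Wrap a plan in adversarial-review instructions for a rival model."""
    header = REVIEW_INSTRUCTIONS.format(author=author_model)
    return f"{header}\n\n--- PLAN UNDER REVIEW ---\n{plan}"

if __name__ == "__main__":
    plan = "1. Store passwords in plaintext for simplicity.\n2. Ship it."
    print(build_review_prompt(plan, "Claude"))
```

The point of the separation is that the reviewing model gets no credit for the plan's design choices, so it has no incentive to defend them — the same reason a second human reviewer catches things the author can't.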
Originally posted by u/i_am_fear_itself on r/ClaudeCode
