After a lot of back and forth I landed on a workflow that has been working really well for me: Claude Code with Opus 4.6 for planning and writing code, Codex GPT 5.4 strictly as the reviewer.

The reason is not really about which one writes better code. It’s about how they behave when reviewing.

When GPT 5.4 reviews something Opus wrote, it actually goes out of its way to verify things: whether the logic holds, whether the implementation matches what’s claimed, whether the assumptions are solid. And it keeps doing that across iterations. That’s the key part.

Say you have this flow:

1. GPT writes a doc or some code
2. I send it to Opus for review
3. Opus finds issues, makes annotations
4. I send those back to GPT/Codex to fix
5. Then back to Opus for another pass

What I notice is that Opus does verify things on the first pass, but on the second round it tends to “let the file go.” Once the obvious stuff has been addressed, it’s much more willing to approve. It doesn’t fully re-investigate from scratch.

GPT 5.4 doesn’t do that. If I send it a second pass, it doesn’t just assume the fixes are correct because they addressed the previous comments. It goes deep again. And on the next pass it still finds more edge cases, inconsistencies, bad assumptions, missing validation, unclear wording.

It’s genuinely annoying in the best way. It keeps pressing until the thing actually feels solid. It does not “release” the file easily.

This isn’t me saying Opus is bad. For building, it’s actually my preference by far. It hallucinates way less, it’s more stable for actual production code, and it tends to behave like a real developer would. That matters a lot when I’m working on projects at larger companies where you can’t afford weird creative solutions nobody will understand later.

GPT 5.4 is smart, no question. But when it codes, it tends to come up with overly clever logic, the kind of thing that works but that no normal dev would ever write.
It’s like it’s always trying to be impressive instead of being practical.

For planning it’s a similar dynamic. Codex is great at going deep on plans, but since Opus isn’t great at reviewing, I usually flip it: Opus makes the plan, Codex reviews it.
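
For anyone who wants to script this, here is a rough sketch of the write/review loop in Python. It is not tied to any real Claude Code or Codex API: `review_loop`, `ReviewResult`, `stub_writer` and `stub_reviewer` are all hypothetical names, and the two stubs only stand in for whatever actually calls the models. One assumption worth flagging: the reviewer is handed only the task and the current draft, never its own earlier comments, which is one possible way to push every pass toward a full re-check instead of a "did they address my notes" check.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ReviewResult:
    """Verdict from one review pass (hypothetical structure)."""
    approved: bool
    issues: List[str] = field(default_factory=list)


# Hypothetical signatures: plug in whatever calls your writer/reviewer models.
Writer = Callable[[str, str, List[str]], str]   # (task, previous draft, issues) -> new draft
Reviewer = Callable[[str, str], ReviewResult]   # (task, draft) -> verdict


def review_loop(task: str, write: Writer, review: Reviewer,
                max_rounds: int = 5) -> Tuple[str, List[ReviewResult]]:
    """Alternate writer and reviewer until approval or max_rounds is reached."""
    draft = ""
    issues: List[str] = []
    history: List[ReviewResult] = []
    for _ in range(max_rounds):
        draft = write(task, draft, issues)
        # The reviewer sees only the task and the draft, not its old comments,
        # so each pass has to verify the whole thing again.
        result = review(task, draft)
        history.append(result)
        if result.approved:
            break
        issues = result.issues
    return draft, history


def stub_writer(task: str, draft: str, issues: List[str]) -> str:
    """Stand-in for the coding model: 'fixes' each issue by appending a line."""
    lines = [draft] if draft else [f"# {task}"]
    lines += [f"handled: {issue}" for issue in issues]
    return "\n".join(lines)


def stub_reviewer(task: str, draft: str) -> ReviewResult:
    """Stand-in for the reviewing model: re-checks every requirement each pass."""
    required = ["empty input", "timeout"]
    missing = [r for r in required if f"handled: {r}" not in draft]
    return ReviewResult(approved=not missing, issues=missing)


if __name__ == "__main__":
    final_draft, passes = review_loop("parse config", stub_writer, stub_reviewer)
    print(final_draft)
    print(f"review passes: {len(passes)}")
```

The `max_rounds` cap is there because a reviewer that never "releases" the file would otherwise loop forever.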
Originally posted by u/r4f4w on r/ClaudeCode
