Are there any benchmarks that confirm opus 4.7 has regressed wrt 4.6?

www.reddit.com

eifachposteMB to AI (Reddit RSS) · English · 3 hours ago

Original Reddit post

I've been using Claude Code for a while now, and lately Opus 4.7 seems to be acting up: it doesn't follow prompts, or it flat-out refuses work, saying "it's a multi-day effort that I won't touch this session". This wasn't always the case, and I somehow feel that Opus 4.6 was much better. But all the benchmarks on artificialanalysis.com still show Opus 4.7 xhigh as the best model. Has anyone found or run a benchmark that actually shows a regression wrt Opus 4.6?

Originally posted by u/acertainmoment on r/ClaudeCode
