Benchmarks dominate most AI discussions, but real users don't work in benchmark conditions. Tools that let people run the same prompt across multiple models and judge the outputs directly, in context and on real tasks, feel closer to actual usage than leaderboards do. Should evaluation shift more toward side-by-side comparisons on real work, or are benchmarks still the only meaningful signal at scale?
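For concreteness, here is a minimal sketch of the side-by-side workflow the post describes: send one prompt to several models and print the replies together for a human to judge. This is not any specific tool's API; it assumes a hypothetical OpenAI-compatible chat completions endpoint, and the endpoint URL and model names are placeholders.

```python
# Minimal side-by-side comparison sketch: one prompt, several models,
# outputs printed together for direct human judgment.
# Assumes a hypothetical OpenAI-compatible endpoint; URL, key, and
# model names below are placeholders, not a real service.
import os
import requests

API_URL = "https://api.example.com/v1/chat/completions"  # placeholder endpoint
API_KEY = os.environ.get("API_KEY", "")
MODELS = ["model-a", "model-b", "model-c"]  # hypothetical model names

def ask(model: str, prompt: str) -> str:
    """Send one prompt to one model and return its reply text."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

prompt = "Summarize this contract clause in plain English: ..."
for model in MODELS:
    print(f"--- {model} ---")
    print(ask(model, prompt))
```

The point of the loop is that the same real-task prompt goes to every model, so the judgment happens in context rather than on a leaderboard aggregate.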
Originally posted by u/Life-Strategy4490 on r/ArtificialInteligence
