As artificial intelligence systems began scoring extremely high on long-used academic benchmarks, researchers noticed a growing problem: the tests that once challenged machines were no longer difficult enough. Well-known evaluations such as the Massive Multitask Language Understanding (MMLU) benchmark, previously considered demanding, now fail to properly measure the capabilities of today's advanced AI models. To address this, a worldwide group of nearly 1,000 researchers, including a professor from Texas A&M University, developed a new kind of test. Their goal was to build an exam that is broad, difficult, and grounded in expert human knowledge in ways that current AI systems still struggle to handle.
Originally posted by u/PixeledPathogen on r/ArtificialInteligence

