https://youtube.com/watch?v=5MO3sy2QN-g

That’s 95% relative to the second best human. It means the AI took 1.026 actions for every 1 action the second best human took to beat the games, since (1/1.026)^2 ≈ 0.95.

And that’s despite the flaws in the benchmark. A former OpenAI researcher (who worked on OpenAI Five, which beat the Dota 2 champions) and competitive coding champion shows the glaring flaws and biases of ARC-AGI-3:

https://xcancel.com/FakePsyho/status/2037279261267038657?s=20

https://xcancel.com/FakePsyho/status/2036891649079439525

I also don’t think using a harness is a problem, in the same way that humans are allowed to use prescription glasses to see or high-level programming languages to build software. AGI can be LLM + harness. It doesn’t have to be the LLM alone.

And of course, there’s no way any of the games are in the training data of the LLMs yet.
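
For anyone who wants to check the 95% figure, here is a minimal sketch. It assumes the score is the squared ratio of the human baseline's action count to the AI's action count, which is what the (1/1.026)^2 arithmetic implies. The function name `relative_efficiency_score` is made up for illustration and is not taken from the benchmark's own code.

```python
def relative_efficiency_score(ai_actions: float, human_actions: float) -> float:
    """Squared ratio of baseline actions to AI actions.

    Assumption: this mirrors the (1/1.026)^2 arithmetic in the post;
    it is not the benchmark's official implementation.
    """
    if ai_actions <= 0 or human_actions <= 0:
        raise ValueError("action counts must be positive")
    return (human_actions / ai_actions) ** 2


# 1.026 AI actions for every 1 action by the second best human
score = relative_efficiency_score(ai_actions=1.026, human_actions=1.0)
print(f"{score:.2f}")  # prints 0.95
```

Because the ratio is squared, extra actions get penalized more than linearly: under the same assumed formula, taking 2 actions per human action would score 25%, not 50%.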
Originally posted by u/Tolopono on r/ArtificialInteligence
