honestly getting so exhausted by the narrative that if we just throw enough gpus and data at an autoregressive model it will eventually wake up and truly understand formal math like sure, it can spit out a react component just fine. But the second you need absolute correctness with zero partial credit, the whole next-token prediction facade shatters. I was reading up on how systems like Aleph are clearing these massive formal reasoning benchmarks right now, and the underlying tech literally has to rely on strict mathematical verification instead of just guessing the most plausible sounding string of text We are absolutely deluding ourselves if we think standard llms are going to safely run critical infrastructure without the industry fundamentally changing how these architectures verify their own logic first submitted by /u/Kilgoretrout123456
Originally posted by u/Kilgoretrout123456 on r/ArtificialInteligence
