Original Reddit post

There is an interesting divergence happening in how AI labs are approaching the reliability problem, and the two camps are genuinely incompatible. I want to lay out both bets and see what people think about which one scales.

One camp, call it the smarter model bet, says that as foundation models get stronger, the need for complex external engineering around them goes down. The argument is that you should invest in a smarter base model, not in elaborate wrappers around a weaker one. Kimi’s recent talk framed this as Loop Engineering instead of Harness Engineering, but the idea is older than that. Google, OpenAI, and Anthropic have all made versions of this argument at various points. If the model is smart enough, it can catch its own mistakes.

The other camp, call it the verification bet, says no model, no matter how smart, can reliably catch its own blind spots. If you let the model check its own work, the same blind spot that produced the error is the one doing the review. So you need verification that is structurally separate from generation: a separate system that did not produce the answer checks the claims against fresh sources. Several recent launches are building this directly into their architecture, with a separate verification team that never touches the original reasoning trace. Other labs are doing related work, running independent critic models or multi-source cross-checking. Apodex is one of the clearer articulations of this approach.

The smarter model bet is elegant. If it works, you just train a better model and the problem goes away. But the history of AI safety is not kind to the "just make it smarter" argument. GPT-4 was smarter than GPT-3.5 and hallucinated less per token, but on long tasks the total hallucination count went up because the model was more confident and users trusted it more. A smarter model that is confidently wrong is more dangerous than a dumber one that signals uncertainty.

The verification bet is messier. It costs more, it is slower, and it is architecturally complex. But it makes a specific, falsifiable prediction: if you take the same model and add independent verification, the reliability gain is measurable and larger than what you would get from a parameter bump. Whether that generalizes outside controlled benchmarks is the real question, and nobody has the answer yet.

What I think is actually going on is that both bets can be true at different scales. For a short question, a smarter model with no verification is probably fine. For a long multi-step research task where the context window is saturated and the model has been reasoning for thousands of steps, the probability that it has made an error somewhere is close to 1. At that scale, betting that the model can catch its own error is asking a lot, and you need the verification camp’s approach.

The interesting thing to watch is whether these two approaches converge. If the next generation of foundation models internalizes verification behavior in training, then the distinction between smarter model and external verification collapses. But until that happens, the two bets are producing two different kinds of AI systems, and the reliability numbers are starting to diverge in ways that matter for anyone deploying these things in production.

Which camp do you think is right about where this converges?
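
To make the scale argument above concrete, here is a rough back-of-the-envelope sketch. It assumes every reasoning step fails independently with the same probability, which is a simplification the post does not claim (real errors are likely correlated). The function names, the per-step error rate, and the verifier catch rate are all hypothetical values chosen for illustration, not measurements from any lab.

```python
def p_any_error(per_step_error: float, steps: int) -> float:
    """Probability of at least one error over `steps` steps.

    Assumes steps fail independently with the same probability
    (an illustrative simplification, not a measured property).
    """
    return 1.0 - (1.0 - per_step_error) ** steps


def p_any_uncaught_error(per_step_error: float, steps: int, catch_rate: float) -> float:
    """Same as above, but a structurally separate verifier catches each
    error with probability `catch_rate` (hypothetical parameter)."""
    residual = per_step_error * (1.0 - catch_rate)
    return 1.0 - (1.0 - residual) ** steps


if __name__ == "__main__":
    base_error = 0.001      # hypothetical per-step error rate
    smarter_error = 0.0005  # hypothetical "smarter model": half the error rate
    catch_rate = 0.9        # hypothetical verifier catch rate

    print(f"{'steps':>6} {'base':>8} {'smarter':>8} {'verified':>9}")
    for steps in (1, 10, 100, 1000, 5000):
        base = p_any_error(base_error, steps)
        smarter = p_any_error(smarter_error, steps)
        verified = p_any_uncaught_error(base_error, steps, catch_rate)
        print(f"{steps:>6} {base:>8.3f} {smarter:>8.3f} {verified:>9.3f}")
```

Under these made-up numbers, the chance of at least one error is negligible for a single step but climbs toward 1 as the step count reaches the thousands, which is the "close to 1" intuition in the post. Whether a verifier or a better base model helps more depends entirely on the assumed rates, so this illustrates the shape of the argument rather than settling it.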

Originally posted by u/Jealous-Leek-5428 on r/ArtificialInteligence