Original Reddit post

When u/karpathy described the strange shape of modern AI capability, he used a useful word for it. The idea is that the surface of what a model can do is not smooth, the way human ability is roughly smooth, but uneven, with sharp peaks of near-superhuman performance rising directly next to valleys of embarrassing failure. The classic demonstration is to ask a frontier model how many days of the week contain the letter d, and watch it try. Sometimes it answers four. Sometimes six. The answer is seven, because every day of the week ends in “day”, which a five-year-old can see in a single glance. The same model, on a different turn, might find a 27-year-old vulnerability in OpenBSD, an operating system whose entire reputation is built on three decades of paranoid code review, and which no human researcher in those three decades had managed to notice was broken. That is what jagged means. The intelligence is real, and the surface of it bears almost no resemblance to the contours of human ability. Most of the conversation since the term was coined has stayed at the level of the model, comparing GPT against Claude or Gemini against Grok and mapping the terrain by benchmark, as if the question were which model is generally smarter rather than where each model’s spikes happen to point. Building an attack harness has changed how I see that map, because the jaggedness lives at more than one level, and the level it lives at most powerfully is the one that almost nobody is talking about. The picture I keep coming back to is a wheel with spokes. Each spoke is a direction in capability-space where some combination of people, capital, and data has been invested. Some spokes grew from the model side, by accident or on purpose. Some spokes grew from the harness side, where a team took a generalist model and built the exact scaffolding their domain needed. The durable products of this era will mostly be the combination of both, a model with a natural lean toward the relevant axis paired with a harness that knows how to climb it. Coding is a spike. Legal is a spike. Protein structure is a spike. Clinical reasoning is a spike. Offensive security is a spike. Each of them gets taller every quarter. The reality is though, you do not need to be a frontier lab to sit on the tip of one of these spokes. You need a model with the right natural lean, which is now a commodity available by API, and a harness built by people who know the target domain cold. That is a small team of the right engineers with conviction and a clear thesis about where the spike points. A group of five people, regardless of their moral standing, can climb to the pointiest end of one of these spokes faster than the institutions built to defend against them can react. AI is the great equaliser, and it equalises specifically at the harness layer. The model is the public good, accessible to everyone for roughly the same price. So in my opinion, the harness is where the asymmetry lives, and the harness costs almost nothing to build relative to what it can do once built. Cybersecurity is the cleanest case study for this asymmetry, because the field has more than twenty years of public history showing how the contest between attack and defence plays out under normal conditions. On the defensive side, the industry spent those two decades building infrastructure: endpoint detection and response systems that watch every process on every machine, security information and event management platforms that aggregate logs from across an enterprise, the slow shift toward zero-trust architectures that assume any given network connection is hostile by default, threat intelligence sharing arrangements between companies and governments, mandatory breach disclosure laws, bug bounty programmes that pay researchers to find flaws before criminals do, and the long professionalisation of the security workforce itself. On the offensive side, attackers spent the same two decades under continuous evolutionary pressure, finding new techniques when their old ones got patched and falling back on the old ones whenever defenders failed to learn the lessons of the previous decade, which they routinely did. The equilibrium that emerged was an uneasy one. submitted by /u/theonejvo

Originally posted by u/theonejvo on r/ArtificialInteligence