eifachposte

One of the biggest repeated arguments I hear against agentic development is that it is not deterministic. And honestly, I do not really dispute that point. But I do think it is worth pointing out that very few resources we rely on in life are fully deterministic. We still rely on plenty of probabilistic systems every day. Humans are probably the most obvious example. They are inconsistent, context-sensitive, occasionally wrong, and still somehow responsible for making most of the world function.

To me, one of the strongest indicators that a developer will be able to use agentic development effectively is not whether they expect the agent to be perfect. It is whether they are good at building deterministic validation around the work they want done. For any task you want an agent to solve, you eventually need tools and mechanisms that let both humans and agents check the work. Those checks will almost never be perfect up front. They will not capture every edge case or every stakeholder expectation on day one. But once you have a structure where criticism can be turned into validation, you create a feedback loop that keeps pushing the work closer to the target.

In my own workflow, I try to lean heavily on behavior-driven, spec-driven, and test-driven development. For every feature, bug, requirement, or vague request that comes in, my first question is usually: “How do I prove this is correct?” Most of the time, that means some combination of unit tests, integration tests, and end-to-end tests.

After that, my next priority is getting the work in front of actual users or stakeholders as quickly as possible. Once real people see it, there are almost always criticisms. Some are functional. Some are visual. Some are behavioral. Some are performance-related. Some are about code style, maintainability, edge cases, naming, or just the fact that what we built does not quite match what they had in their head. I try to treat those complaints as raw material.
From there, I usually sort them into categories and ask how each one could be captured next time. Does this need a test? An analyzer? A linter? A validation script? A source generator? A Claude hook? A Claude skill? A git hook? Some other piece of meta-development tooling? The goal is not to eliminate human judgment. The goal is to stop rediscovering the same problem manually over and over again.

Over the last couple years of working with agentic development, I have found that my outcomes across providers, models, and languages are often more similar than people might expect. There are obvious differences in speed, cost, token usage, reasoning quality, and how much hand-holding each model needs. But when the tools, constraints, and validation criteria are in place, most of them can eventually land on something close to the correct solution.

That has shifted how I think about the whole thing. The model matters, but the validation loop matters more.
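
As an illustration only, the "turn criticism into validation" loop described above could be sketched in Python roughly like this. The `Check` type, the `run_checks` helper, and the three example rules are hypothetical names invented for this sketch, not tooling described in the post. The point is just that each complaint becomes a small deterministic predicate that a human or an agent can rerun against any new output.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Check:
    """One deterministic rule, recorded together with the feedback that caused it."""
    name: str                       # short label reported on failure
    origin: str                     # the criticism that motivated the rule
    passes: Callable[[str], bool]   # pure predicate over the artifact text


def run_checks(artifact: str, checks: List[Check]) -> List[str]:
    """Return the names of all failing checks, in registration order."""
    return [check.name for check in checks if not check.passes(artifact)]


# Hypothetical rules; in practice each would come from a real review comment.
CHECKS = [
    Check(
        name="no-todo-left",
        origin="Reviewer found leftover TODO markers in shipped code",
        passes=lambda text: "TODO" not in text,
    ),
    Check(
        name="max-line-length",
        origin="Style complaint about lines that were too long",
        passes=lambda text: all(len(line) <= 100 for line in text.splitlines()),
    ),
    Check(
        name="has-docstring",
        origin="Maintainer asked for documented functions",
        passes=lambda text: '"""' in text,
    ),
]


if __name__ == "__main__":
    # Stand-in for whatever the agent produced on this iteration.
    agent_output = "def add(a, b):\n    # TODO: handle overflow\n    return a + b\n"
    failures = run_checks(agent_output, CHECKS)
    if failures:
        print("Failed checks:", ", ".join(failures))
    else:
        print("All checks passed")
```

The list of failing names can be fed straight back to the agent as its next prompt, and a new complaint from a stakeholder becomes one more `Check` entry rather than something to catch by eye again. Real setups would more likely use a test runner, linter, or hook for this; the registry here only shows the shape of the loop.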

Originally posted by u/jdsfighter on r/ClaudeCode

Agentic Development Is Probabilistic. The Validation Loop Shouldn’t Be.
