I am a software engineer with 15 years of experience in game development, mostly graphics, physics, and engine programming. I use AI for most of my tasks in one way or another: information search, short brainstorms (rubber-ducking with the AI without really reading the output much), occasional AI-side code reviews to catch issues here and there, and small use-and-forget tools for whatever is needed right now. So I can’t be considered an agentic coder, nor can I call myself a vibe coder, but it would be unfair to say I have no AI experience.

Recently, my wife, who isn’t a coder herself, decided to build a small Python app for her own needs using Claude Code. I won’t go into much detail, but the app is basically a data-crunching machine with very little UI, so it is very hard to tell whether things are going right just by looking at the result. At first, she was really excited about the pace she was making and how helpful Claude was, but after a while she started to notice that something was off here and something was off there. Digging into a problem seemed to fix one issue, but then others started to pop up. Eventually, she discovered that the core logic was completely wrong. We thought, “It’s probably because she is a non-coder, so she can’t wield the tool properly due to lack of experience.”

So I decided to give this agentic coding thing a shot and see how good these tools really are. My plan was simple: collect the useful discoveries from her project into a nice form that I would use next. I spent about two days doing research and writing the architecture document and prompts for a greenfield reimplementation of the project. I was quite meticulous in describing the desired architecture, requirements, structures, and results, and I wrote down all the edge cases I knew of. My expectations were quite high.
I thought I was actually going to make it work quite fast. If I had given my document to a junior dev, they would probably have produced code that wasn’t the best, but still a project that worked. After starting a new session with all this preparation, I was pleased with how fast things were going. But once the initial stage finished and I reviewed the code, I found a lot of things that even a newbie junior probably wouldn’t do. There were multiple copies of constants that were supposed to be the same across the codebase, even though my documentation explicitly stated that there should be only one source of truth for such things. The app has two modes: re-simulating the past using existing historical data, and actually working in real time. The simulation path and the real-time path were basically duplicates of each other, and the worst part was that the duplicates weren’t exact. Again, I had clearly stated in my docs that I wanted them to be as close as possible and to share the same abstractions.

My first thought was, “It’s probably me doing something wrong, but it’s fixable.” Then the cycle of pain and fixes started. The project wasn’t extremely huge, but I wanted to try the approach a lot of people promote, where you don’t write the code yourself. The issue with it is that you either trust the machine and don’t review the code much, or it defeats the purpose, because the time you need to spend understanding the system and the code behind it is often more than I would spend writing it myself. Of course, you should also review the code you write yourself, but we can probably all agree that that is an easier task. My approach was simple: make sure the core was working, then move on to expanding the functionality. Despite writing basically no code, only querying Claude about how it had implemented this and that and guiding it through fixes, I found it extremely exhausting.
I never knew where the system was correct and where implementation mistakes had been made. Since it was just the core, not much proper testing could be done. I was just sitting there, doing nothing, and feeling how draining the experience was. Claude made one mistake after another. Sometimes it broke old code. Sometimes extremely stupid things surfaced that no reasonable person would ever do, like simulating on much smaller timescales while only having data for larger timeframes.

After a few days of fighting the machine, I got something I could call a working program. Despite not writing a single line of code, I felt devastated and exhausted. Never in my life have I felt so bad about making software. Taking everything into consideration, I really don’t know how people use coding agents in this mode. I am sure that if the thing you are trying to build is really boilerplate-heavy and doesn’t have any complex logic, you can probably one-shot it. But I feel that writing it the old-school way would probably have been faster, considering I spent a total of five days on this experiment.

Don’t get me wrong: I do see some net gains from AI. The ease of obtaining information and examples, along with small bits of boilerplate here and there, makes my life easier. But building a whole project with AI is just a miserable experience, because you can never trust what it wrote, and you need to think ten times harder just to catch what it might have done wrong.
Originally posted by u/AliorUnity on r/ClaudeCode
