Everyone here is exited no how good Claude works , and I get it. all my colleagues are developers and there are doing things at amazing speeds. I’m a DevOps , mostly about automation and infrastructure and I can’t make Claude write stuff in the way I want , for the life of me. As a practical example : I asked to create a technical documentation describing the behavior of our custom deployment “script” . For context: this “script” deploy our application directly to customers … application is based on micro-services , in which each service has its own features toggle ( ~ 15 as average each service) plus based on which high level features the customer wants or doesn’t , not all micro-services will be deployed all the time. As a result , the deployment script as ~250 arguments . The purpose of the documentation is to describe what will be the end result for the customer for a change in each of the initial arguments ( consider it as a very complex flow-chart if you want ). Because the “script” is so complex , there is no way to make a manual review of the final document so as a guardrail I asked Claude to “read the full documentation, not taking shortcuts” . Pay attention: I didn’t really “asked” it to do it . I optimized a prompt with a skill to make a plan to enforce this concept. I reviewed the optimized prompt to make the plan 4 times: in it there was the constraint , ~10 bash commands that are shortcuts ( grep , ed , sed … the work ) to avoid plus technical details how the plan should be written . First plan version , it wrote the constrain into it and in another section a list of 4 specific files that need to be read ( over ~40 ) … after 6 iteration to write the plan , where Claude one after the other wrote all possible loopholes to jump the “read the full code” instruction , I had a plan with:
- the constraints written as mandatory
- a table with high level requirement that will need to be satisfied ( that will need to be checked while reading the code / writing the final doc)
- a second table with a full list of files that need to be read (actual files absolute paths) with a check in which it has to be written how each of the file was read
- a “Definition of done” written literally as “6. GOAL (Definition of Done) —to be interpreted as Claude Code /goal option” At this point , I let Claude start to actually write the documentation. It decided to run agents , fine. At iteration 1, the prompt said: "agents forgot to fill the constraint checkbox , should I do it ? " and I though myself … this is a good beginning … At iteration 4, the prompt directly said: “I finished. I didn’t read this files (with the list) because I decided they were no relevant. I updated the plan with this note” At iteration 6, it confirmed all my code was fully written. by now … 5 hours were passed. Then I opened another Claude code prompt , and I asked to check the documentation produced by the previous flow ; it found ~10 initial arguments where described wrongly , coming from 4 different files in the code. I GAVE UP How can i trust the documentation that was made? Could it be caused by missing human skill in handling Claude Code ? maybe . But I really ask you, as a game : you try to do the same task ( for real , not hypothetical), can you make Claude do it ? if you actually succeed … please , explain me how!!! I can share the reviewed prompt and the reviewed plan if you want ( they are in Italian and I’ll redact sensible information ) P.S. I’m not asking to do the task for me, the task is already done … I’m frustrated because I can’t trust the end result , for obvious reasons … how can I trust it if after all the guardrails and iterations , a new fresh prompt still find errors? should I run 1000 more prompts over it to see if all say that’s good? submitted by /u/eltear1
Originally posted by u/eltear1 on r/ClaudeCode
