What is the best or go-to approach for using AI agents like Claude Code or Codex when working on large applications, especially for major updates and refactoring?

**What is working for me**

With AI agents, I am able to use them in my daily work for:
- Picking up GitHub issues by providing the issue link
- Planning and executing tasks in a back-and-forth manner
- Handling small to medium-level changes

This workflow works fine for me.

**Where I am struggling**

I am not getting real benefits when it comes to:
- Major updates
- Large refactoring
- System-level improvements
- Improving test coverage at scale

I feel like I might not be using these tools in the best possible way, or I might be missing knowledge about the right approach.

**What I have explored**

I have been checking different approaches and tools:
- Ralph Loop (many people seem to have built their own versions), e.g. https://github.com/snarktank/ralph
- https://github.com/Fission-AI/OpenSpec
- https://github.com/github/spec-kit
- https://github.com/obra/superpowers
- https://github.com/gsd-build/get-shit-done
- https://github.com/bmad-code-org/BMAD-METHOD
- https://runmaestro.ai/

But now I am honestly very confused by the number of approaches around AI agents.

**What I am looking for**

I would really appreciate guidance on:
- What is the best workflow for using AI agents on large codebases?
- How do you approach big refactorings or feature planning and execution with AI?
- What is the best way to handle complex tasks with these agents?

I feel like AI agents are powerful, but I am not able to use them effectively for large-scale problems. What workflows can be defined that deliver a real benefit? So far I have defined:
- Slash commands
- Skills (my own)
- Community skills, but again only in bits and pieces (I gave superpowers a shot with its defined skills, e.g. `/superpowers:brainstorming <CONTEXT>`; it did load the skill, but …). I want a proper flow that can really help me do major things: understanding and implementation. A rough idea, e.g. for writing test cases for a large monolith application:
- Analysing -> Brainstorming -> Figuring out concerns -> Planning -> Execution plan (autonomous) -> Doing it in chunks, e.g. 20 features -> 20 plans -> 20 executions -> test cases per feature -> validating/verifying each feature's tests -> 20 PRs. That is roughly what I have in mind, but feel free to advise. What is the best way to handle such workflows? Any advice, real-world experience, or direction would really help.
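For what it's worth, the chunked pipeline above can be sketched as a driver loop: one plan, one execution, one verification, and one PR candidate per feature. This is only an illustrative sketch, not a known-good workflow; `run_agent` and `run_tests` are hypothetical stand-ins that you would wire to a real agent invocation (e.g. shelling out to `claude -p "<prompt>"`) and to your actual test runner.

```python
# Illustrative sketch of a per-feature chunked workflow (assumptions:
# run_agent and run_tests are injected stand-ins for the real agent
# call and test runner, so the loop itself stays small and testable).
from dataclasses import dataclass

@dataclass
class FeatureResult:
    feature: str
    plan: str
    tests_pass: bool
    branch: str

def process_feature(feature, run_agent, run_tests):
    """Plan -> execute -> verify one feature on its own branch."""
    plan = run_agent(f"Plan test coverage for feature: {feature}")
    run_agent(f"Write tests following this plan:\n{plan}")
    branch = f"tests/{feature.lower().replace(' ', '-')}"
    return FeatureResult(feature, plan, run_tests(feature), branch)

def chunked_pipeline(features, run_agent, run_tests):
    """Run every feature independently; keep only verified results."""
    results = [process_feature(f, run_agent, run_tests) for f in features]
    # Only features whose tests pass become PR candidates;
    # failures go back into another plan/execute round instead.
    return [r for r in results if r.tests_pass]

if __name__ == "__main__":
    fake_agent = lambda prompt: f"plan for: {prompt}"
    fake_tests = lambda feature: feature != "Billing"  # pretend one fails
    ready = chunked_pipeline(["Login", "Billing", "Search"],
                             fake_agent, fake_tests)
    print([r.branch for r in ready])
```

The point of the structure is that each feature is an independent unit of work, so a failed verification only re-queues that one chunk rather than restarting the whole batch.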
Originally posted by u/khizerrehan on r/ClaudeCode
