GitHub link: https://github.com/nikhilsitaram/claude-caliper

Hello. There will be no AI slop in this post, because I am a grown man taking time out of my Sunday afternoon to type this on my real keyboard.

I have been addicted to Claude Code for some time now and found a great middle spot between the native plan mode (which is woefully underprepared for anything more than 5 steps long) and these crazy workflows with 50 AI agents, 20 commands, 10 MCP tools, and some weird personality on top of it (probably called Jarvis or something). I used Superpowers as inspiration and followed their basic workflow, but fleshed it out significantly.

The key thinking behind this workflow:
- KISS: Skills under 1000 words. Only 8 workflow skills and 2 tooling skills. Hooks handle permissions for you. Like, why does Superpowers have skills that are 3000 words long with TDD examples and a separate skill for using git worktrees? Modern Claude knows how to do all of that out of the box.
- Don’t get in Claude’s way. I’m not putting Claude into a box to follow an exact workflow. The skills are just there to guide it along and make sure it follows actual success criteria.
- As little human interaction as possible once the design is approved. You go through the design spec with Claude, it creates success criteria which are hard-coded into JSON files, and then it creates a spec and plan for subagents to follow. Reviews are done every step of the way and if >5 issues arise, it re-runs review until clean. You start with an idea, approve it, then it does all the work and creates a PR.
- No agent ever reviews or confirms its own work. Claude will very confidently boast about what a great job it did when it’s finished. But it’s always hilarious that when I run a subagent to check its work, it always comes back with issues that the main agent immediately admits need a fix. LLM decisions are re-reviewed until clean, and design specs are compared against the hard-coded JSON deterministically.
- Context engineering. Every step is done as a subagent with no context provided outside of what is absolutely necessary to get the job done. Phase agents get only that phase and handoff notes, task agents get only that task. Review agents only get the spec and the git diff. The hierarchy of subagents is then checked at higher and higher levels until the total output matches exactly what you intended in the design.

Tooling skills:
- skill-eval: This runs headless Claude sessions with dummy prompts trying to poke holes in your skill. That lets you truly A/B test skills to see what kind of verbiage is better and what isn’t necessary. This could honestly be its own repo. Really cool.
- codebase-refactor: Looks at your entire codebase, or a certain dir you point it to, and takes a top-down look for coding standards, DRY, YAGNI, etc.

Permissions:
- Hooks contain safe read-only commands and fall back to auto mode. Any new command that doesn’t fit the safe list will be stored, and the user can choose to add it to the list or modify it manually. So much better not getting permission prompts but also not using skip permissions dangerously.

Anyway feel free to check it out. Or don’t - fuck it. I’m open to feedback. If you try it out and see some holes, create a GitHub issue.

Link: https://github.com/nikhilsitaram/claude-caliper

submitted by /u/The_Hindu_Hammer
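
For anyone wondering what the permissions idea looks like in practice, here is a rough Python sketch of the logic: allow commands on a safe list, and for anything else fall back to the normal permission flow while remembering the command so the user can review it later. This is not the actual hook from the repo. The names `SAFE_COMMANDS`, `SAFE_GIT_SUBCOMMANDS`, and `classify` are made up for illustration, and so is the content of both lists.

```python
import shlex

# Hypothetical safe list for illustration; not the list shipped in the repo.
SAFE_COMMANDS = {"ls", "cat", "grep", "rg", "head", "tail", "wc", "pwd"}
# Git subcommands that are normally read-only (also an assumption).
SAFE_GIT_SUBCOMMANDS = {"status", "diff", "log", "show"}
# Shell syntax that can chain or redirect; anything containing it is not auto-allowed.
RISKY_SYNTAX = (";", "&", "|", ">", "`", "$(")


def classify(command: str, pending: list[str]) -> str:
    """Return "allow" for a safe-listed command, otherwise "fallback".

    Commands that are not allowed are appended to `pending` (once each)
    so the user can later add them to the safe list or edit it by hand.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        tokens = []  # unbalanced quotes: treat the command as unknown

    has_risky_syntax = any(sym in command for sym in RISKY_SYNTAX)
    if tokens and not has_risky_syntax:
        if tokens[0] in SAFE_COMMANDS:
            return "allow"
        if tokens[0] == "git" and len(tokens) > 1 and tokens[1] in SAFE_GIT_SUBCOMMANDS:
            return "allow"

    if command not in pending:
        pending.append(command)
    return "fallback"
```

With this sketch, `classify("git status", pending)` is allowed straight away, while `classify("rm -rf build", pending)` falls back and leaves the command in `pending` for review.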
Originally posted by u/The_Hindu_Hammer on r/ClaudeCode
