Original Reddit post

I’ll put the full write up for this in the comments. I just closed the loop on my effort to remove myself from the entire development process end to end using multiple agents and GitHub accounts with an MCP and Skill that can use my software from Claude Code so now I’m just monitoring the project dashboard while everyone works. I stress in my blog post on this that this works for me because I spent a year heads down on a really novel and scalable architecture and all of the code being writing by an agent follows my patterns and designs. This will produce a pile of garbage without that and 30 yoe going into it. I have two distinct users, each with their own email on my domain and GitHub accounts in my org. Really important to understand GitHub’s TOS on bot accounts and name them correctly with *-bot, make them private and be very clear they aren’t people - GitHub will delete you if you screw this up. They’re named Blue and Ghost (after the shrimp). Blue is a developer, Ghost is a tester / QA. Blue can open pull requests and comment on issues, Ghost can open and comment on issues. I have several servers running that are setup with either Blue or Ghost’s ssh keys, git config and fine grained access tokens. Each server is running a self hosted GitHub action runner with a label like - smoke, qa, dev, feature. Here’s the flow - it’s all driven by GitHub issues, actions and agents: I create spec and assign work to Blue One Blue server is polling git looking for work Blue picks up a dev task, and opens a PR labeled for QA A GHA is set to run when a PR is opened like that by Blue The action runs on a smoke test machine checking out the branch and doing a full build and install then runs Claude code with instructions for Ghost Ghost will either pass the PR or make comments which Blue will pick up. Ghost uses the MCP and skill to run a series of tests and test what the PR says. If it’s a bug fix, Ghost filed the original and therefor knows how to confirm. Ghost opens issues if he see a bug, another instance of Ghost is looking for those and will use git blame and perform a post mortem on the regression, updating our docs so it doesn’t happen again and Blue gets better. All of the agents run from specific scripts for their role out of a centralized repo of lessons and md files. If anyone experiences friction or had to figure something out that would have been easier if that repo, the skill, or MCP was improved it files an issue to that repo and another Blue instance picks it up to improve. SonarQube, Lint, Copilot all scan for issues and open work for Blue Another Ghost instance does a full end to end test as a user, with no knowledge of the code base just the MCP - files issues or green lights a merge to main. The trick is the different users and agent rolls, driving everything with issues, and the MCP and Skill that knows how to use my software both the agents use and my users get. The feedback loop of constant improvement is key. Call me crazy but also I have a pro account and this is all very cheap token wise - I think that’s because the work is atomic and stateless so token usage is low for each issue. Skills and instructions are role specific so agents don’t need my entire skill set to work. After 30 years of coding and mastering my craft, I’m kicking back with the dogs and watching this on my tablet - which I think was the whole idea. It’s fun watching my boys go at it.

  • cheers submitted by /u/Ok_Cartographer_6086

Originally posted by u/Ok_Cartographer_6086 on r/ClaudeCode