I built a Claude Code plugin that validates startup ideas: market research, competitor battle cards, positioning, financial projections, go/no-go scoring. The interesting part isn't what it does. It's the multi-agent architecture behind it. Posting this because I couldn't find a good breakdown of agent patterns for Claude Code skills when I started. Figured I'd share what actually worked (and what didn't).

The problem

A single conversation running 20+ web searches sequentially is slow. By search #15, early results are fading from context. And you can't just dump everything into one massive prompt; quality drops fast when an agent tries to do too many things at once.

The solution: parallel agent waves.

The architecture

4 waves, each with 2-3 parallel agents. Every wave completes before the next starts.
Wave 2: Competitive Analysis (3 agents) Competitor deep-dives + substitutes + GTM analysis
Wave 3: Customer & Demand (3 agents) Reddit/forum mining + demand signals + audience profiling
Wave 4: Distribution (2 agents) Channel ranking + geographic entry strategy
Each agent runs 5-8 web searches, cross-references across 2-3 sources, rates source quality by tier (Tier 1: analyst reports, Tier 2: tech press, Tier 3: blogs/social). Everything gets quantified and dated.
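One way a rated, cross-referenced finding could be represented (a minimal sketch; the class and field names are mine, not from the skill):

```python
from dataclasses import dataclass

# Source-quality tiers as described above:
# Tier 1 = analyst reports, Tier 2 = tech press, Tier 3 = blogs/social.
TIER_LABELS = {1: "analyst report", 2: "tech press", 3: "blog/social"}

@dataclass
class Finding:
    claim: str          # the quantified, dated statement
    sources: list[str]  # cross-referenced sources (ideally 2-3)
    tier: int           # best tier among the sources (1 is strongest)
    as_of: str          # date the figure applies to

    def is_corroborated(self) -> bool:
        # A finding counts as cross-referenced with 2+ independent sources
        return len(self.sources) >= 2

f = Finding(
    claim="TAM roughly $4B in 2024",
    sources=["analyst report", "tech press article"],
    tier=1,
    as_of="2024-06",
)
```

The point of forcing every claim into this shape is that "everything gets quantified and dated" stops being a guideline and becomes a schema the agents have to fill in.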
Waves are sequential because each needs context from the previous one. You can't profile customers without knowing the competitive landscape. But agents within a wave don't talk to each other; they work in parallel on different angles of the same question.
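The wave pattern can be sketched like this. This is a toy Python analogy, not the skill's actual code: `run_agent` and `synthesize` are stand-ins for subagent invocations and the synthesis step, and the Wave 1 task names are my guesses (the post only lists waves 2-4):

```python
from concurrent.futures import ThreadPoolExecutor

def run_agent(task: str, context: str) -> str:
    # Stand-in for launching a subagent with its research template
    # plus the synthesized context from the previous wave.
    return f"findings for {task}"

def synthesize(results: list[str]) -> str:
    # Stand-in for condensing a wave's raw output into a brief.
    return " | ".join(results)

WAVES = [
    ["market sizing", "trend analysis"],                       # Wave 1 (illustrative)
    ["competitor deep-dives", "substitutes", "GTM analysis"],  # Wave 2
    ["forum mining", "demand signals", "audience profiling"],  # Wave 3
    ["channel ranking", "geographic entry"],                   # Wave 4
]

context = "initial idea brief"
for tasks in WAVES:
    # Agents within a wave run in parallel and never talk to each other;
    # each sees only the synthesized output of the previous wave.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        results = list(pool.map(lambda t: run_agent(t, context), tasks))
    # Waves are sequential: the next wave builds on this synthesis.
    context = synthesize(results)
```

The key structural choice is that the loop body is the only synchronization point: parallelism inside a wave, a hard barrier between waves.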
5 things I learned
1. Constraints > instructions.
"Run 5-8 searches, cross-reference 2-3 sources, rate Tier 1-3" beats "do thorough research." Agents need boundaries, not freedom. The more specific the constraint, the better the output.
2. Pass context between waves, not agents.
Each agent gets the synthesized output of the previous wave. Not the raw data, the synthesis. This avoids circular dependencies and keeps each agent focused on its job.
3. Plan for no subagents.
Claude.ai doesn't have the Agent tool. The skill detects this and falls back to sequential execution: same research templates, same depth, just one at a time. Designing for both environments from day one saved a painful rewrite later.
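A minimal sketch of that dual-path design, assuming a boolean flag stands in for the skill's actual Agent-tool detection (all names here are mine):

```python
from concurrent.futures import ThreadPoolExecutor

def research(task: str, context: str) -> str:
    # Stand-in for one research template: same template runs in both modes.
    return f"report on {task}"

def run_wave(tasks: list[str], context: str, has_subagents: bool) -> list[str]:
    if has_subagents:
        # Claude Code path: fan the tasks out to parallel subagents.
        with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
            return list(pool.map(lambda t: research(t, context), tasks))
    # Claude.ai path: same templates, same depth, just one at a time.
    return [research(t, context) for t in tasks]
```

Because both branches run the same templates in the same order, the outputs are identical; only wall-clock time differs. That equivalence is what makes designing for both environments from day one cheap.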
4. Graceful degradation.
No WebSearch? Fall back to training data, flag everything as unverified, reduce confidence ratings. Partial data beats no data. The user always knows what's verified and what isn't.
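Sketched as a policy function (illustrative names; the post doesn't show the skill's internals):

```python
def gather(topic: str, web_search_available: bool) -> dict:
    """Return a research note that always carries its own provenance."""
    if web_search_available:
        return {
            "topic": topic,
            "verified": True,
            "confidence": "high",
            "source": "web search",
        }
    # Fallback: answer from training data, but say so loudly and
    # knock the confidence down so downstream scoring reflects it.
    return {
        "topic": topic,
        "verified": False,
        "confidence": "low",
        "source": "training data (UNVERIFIED)",
    }
```

The rule being encoded: degrade the confidence metadata, never silently degrade the data itself.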
5. Checkpoint everything.
Full runs can hit token limits. The skill writes `PROGRESS.md` after every phase. The next session picks up exactly where it stopped. Without this, a single interrupted run would mean starting over from scratch.
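The checkpoint/resume loop might look something like this. The real skill writes `PROGRESS.md`; a JSON file stands in here because the post doesn't document the actual format:

```python
import json
import os

CHECKPOINT = "PROGRESS.json"  # stand-in for the skill's PROGRESS.md

def save_phase(phase: int, synthesis: str) -> None:
    # Written after EVERY completed phase, so a token-limit cutoff
    # loses at most the phase in flight.
    with open(CHECKPOINT, "w") as f:
        json.dump({"last_completed_phase": phase, "synthesis": synthesis}, f)

def resume() -> tuple[int, str]:
    # Next session starts at the phase after the last completed one.
    if not os.path.exists(CHECKPOINT):
        return 0, ""
    with open(CHECKPOINT) as f:
        state = json.load(f)
    return state["last_completed_phase"] + 1, state["synthesis"]
```

Note what gets checkpointed: the phase number and the synthesis, not the raw search results, which mirrors the "pass synthesis, not raw data" rule between waves.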
What surprised me
The hardest part wasn't the agents. It was the intake interview: extracting enough context from the user in 2-3 rounds of questions without feeling like a form, while asking deliberately uncomfortable questions ("What's the strongest argument against this idea?", "If a well-funded competitor launched this tomorrow, what would you do?"). Zero agents. Just a well-designed conversation. And it determines the quality of everything downstream.
The full process generates 30+ structured files. Every file has confidence ratings and source flags. If the data says the idea should die, it says so.
Open source, 4 skills (design, competitors, positioning, pitch), MIT license:
ferdinandobons/startup-skill
Happy to answer questions about the architecture or agent patterns. Still figuring some of this out, so if you've found better approaches I'd love to hear them.
Originally posted by u/ferdbons on r/ClaudeCode
