Original Reddit post

Before starting: I keep things very real. I’m dropping sauce and also linking my SaaS at the end, for showcasing purposes, not promoting… so you can see that I turn my words into action, not just fluff.

I’ve been posting tips here for months but never told the full story. So here’s everything I learned going from “let Claude build it” to actually running a SaaS in production with real users, real payments, and real infrastructure. Built entirely with Claude Code. This isn’t theory. I shipped a product with 3 databases, multiple services, Stripe billing, Discord bot integration, web widgets, and a custom retrieval pipeline. All with AI. Here’s how it actually went down.

Phase 1: I was the problem, not Claude

First few weeks I blamed Claude for everything. Trash code, wrong assumptions, half-broken features. Then I realized something that changed everything: Claude amplifies whatever you give it. Feed it confusion, get confusion back. Feed it clarity, get clarity back.

The single biggest unlock was planning BEFORE opening the terminal. I’m talking feature specs, database schemas, file structures, screenshots of what I want. Everything. I started spending 60-70% of my time on planning and 30% on actual prompting. Output quality went through the roof overnight.

XML-formatted prompts made a massive difference too. LLMs parse structured data natively, so instead of writing paragraphs I started wrapping everything in tags. Felt weird at first but the results speak for themselves.

Phase 2: CLAUDE.md hit a wall, but hooks saved everything ngl

Like everyone here I started with a massive CLAUDE.md file. Worked great for a week. Then the context window fills up and Claude basically forgets everything you put in there. I was pulling my hair out. Switched to hooks and that one move solved roughly 80% of my problems. The big one is a UserPromptSubmit hook that runs before every single message.
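The post doesn’t include the hook script itself, but the shape is easy to sketch. Assuming Claude Code’s convention of passing hook input as JSON on stdin and treating stdout as extra context for the model, and with made-up patterns, weights, thresholds, and agent names, a weighted-regex scorer could look something like this:

```python
import json
import re
import sys
from typing import Optional

# Hypothetical weights: each pattern that matches the prompt adds (or
# subtracts) from the complexity score. Tune these to your own codebase.
WEIGHTED_PATTERNS = [
    (r"\b(refactor|migrat\w+|architecture)\b", 3),
    (r"\b(auth|security|payment|billing)\b", 3),
    (r"\b(database|schema|index)\b", 2),
    (r"\b(typo|rename|comment|format)\b", -2),
]

# Keyword -> specialist agent routing (agent names are illustrative).
AGENT_ROUTES = [
    (r"\b(auth|security)\b", "security-reviewer"),
    (r"\b(schema|database|sql)\b", "db-architect"),
]

def score(prompt: str) -> int:
    text = prompt.lower()
    return sum(w for pat, w in WEIGHTED_PATTERNS if re.search(pat, text))

def route(prompt: str) -> Optional[str]:
    text = prompt.lower()
    for pat, agent in AGENT_ROUTES:
        if re.search(pat, text):
            return agent
    return None

if __name__ == "__main__":
    # In a real hook, the event would arrive as JSON on stdin:
    #   event = json.load(sys.stdin)
    event = {"prompt": "Refactor auth migration"}
    prompt = event.get("prompt", "")
    if score(prompt) >= 3:
        print("High-complexity task: plan step by step before editing code.")
    agent = route(prompt)
    if agent:
        print(f"Route this to the {agent} agent.")
```

Plain `re.search` over a handful of patterns is why this stays well under 100ms: there is no model call anywhere on the hot path.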
It does two things: scores task complexity using weighted regex patterns, and routes to the right specialist agent based on keywords. Simple stuff like “fix typo” scores low and Claude goes quick mode. “Refactor auth migration” scores high and triggers deep analysis with structured thinking before touching any code. No ML, no embeddings, no API calls. Just regex and weights. Runs in under 100ms.

Then I stacked more hooks on top. A PostToolUse hook on Write/Edit that forces a 4-step review chain (simplify, self-critique, bug scan, prove it works) before Claude gets to say “done”. Another PostToolUse on Bash that catches git commits and makes Claude reflect on what went right and wrong, saving lessons to a reflections file. Then a separate hook pulls from that file and feeds relevant lessons back into the next prompt. The cycle: commit, reflect, save the lesson, feed it back next session. After a couple weeks my reflections file had 40+ entries and Claude genuinely stopped repeating mistakes that cost me hours before. Lowkey the best part of the whole system.

Phase 3: Architecture is where most people fail

This is the part nobody talks about. Claude Code will happily build you a working prototype that falls apart the second real users touch it. Scaling and security are the two biggest gaps.

For architecture I developed a process:

1. Ask Claude to deep review your codebase and generate a doc explaining current state vs expectation
2. Send that doc + a generated prompt to Claude deep research mode to help design the target architecture
3. Split every piece into separate .md files explaining implementation from A to Z
4. Create specialized architecture agents
5. Implement very slowly, running tests before moving to the next piece

For security I literally approach my own app like a pentester. Rate limiting from day one (100 req/hour per IP, start strict). RLS enabled on every table in Postgres so users can only see their own data. API keys in secret managers, never in code.
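That 100 req/hour per IP rule is only a few lines of logic. A minimal in-memory sliding-window sketch (the class name and defaults are mine; a real deployment would back this with Redis or your framework’s rate-limit middleware so it survives restarts and multiple instances):

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window`
    seconds, tracked per key (e.g. the client IP)."""

    def __init__(self, limit: int = 100, window: float = 3600.0):
        self.limit = limit
        self.window = window
        self.hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        q = self.hits[key]
        # Drop timestamps that have slid out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # over budget: reject the request (HTTP 429)
        q.append(now)
        return True
```

Wire it in as middleware keyed on client IP, before auth even runs, so fake-registration floods burn out at the edge instead of in your database.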
CAPTCHA on every form. HTTPS everywhere. Input sanitization on both frontend and backend. Dependencies updated monthly minimum, security patches same day. I’ve seen apps get hammered with 10k+ fake registrations in minutes. One bad actor finding exposed credentials can cost you everything. This stuff isn’t optional if you’re handling real user data.

Phase 4: Token management

I’m on the 20x plan and have never hit limits despite running 3-4 terminals in parallel daily. Here’s how:

- Never let Claude compress conversations. Start fresh instead, it’s way cheaper.
- Don’t use thinking/ultrathink mode unless the task genuinely demands it.
- Remove MCP servers you’re not actively using (supabase + github + chrome devtools sitting idle eats ~75k tokens).
- Keep CLAUDE.md under 100 lines.
- Use specialized agents (they consume independent tokens, not your main session).
- Drag screenshots into the CLI instead of writing paragraphs describing UI bugs.
- Ask Claude to respond concisely and directly.

And the big one: at 50% of the token limit, start a new session. Compaction progressively degrades output quality. If you’re mid-refactor, clear context and point Claude to the .md files you generated earlier. Way better than letting it try to compress everything.

Phase 5: The refactoring system

For any file, no matter how complex:

1. Ask CC to generate a doc explaining how the file is used in your codebase
2. Ask CC to generate a detailed plan explaining how the refactor would happen step by step (without implementing)
3. Take both docs + the file to Claude Desktop (Sonnet 4.5 for smaller refactors, Opus for bigger ones) and ask it to generate the prompts for Claude Code to execute
4. Feed each prompt back into CC with a custom refactor command
5. Last prompt is always a deep review

This separates planning from execution and prevents Claude from going off the rails mid-refactor.
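The commit → reflect → save → feed-back cycle from Phase 2 is the glue across all of these phases. The post doesn’t share that code either, but the storage side is simple to sketch. Assuming a hypothetical JSONL file (one lesson per line, each tagged with keywords) and crude keyword overlap as the “relevant lessons” filter:

```python
import json
from pathlib import Path

# Hypothetical location; one JSON object per line (JSONL).
REFLECTIONS = Path(".claude/reflections.jsonl")

def save_lesson(lesson: str, tags: list, path: Path = REFLECTIONS) -> None:
    """Called by the post-commit hook: append one tagged lesson."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a") as f:
        f.write(json.dumps({"lesson": lesson, "tags": tags}) + "\n")

def relevant_lessons(prompt: str, limit: int = 3,
                     path: Path = REFLECTIONS) -> list:
    """Called by the next-session hook: return recent lessons whose
    tags overlap the words in the incoming prompt."""
    if not path.exists():
        return []
    words = set(prompt.lower().split())
    entries = [json.loads(line)
               for line in path.read_text().splitlines() if line.strip()]
    # Newest first; keep only lessons with at least one matching tag.
    matches = [e["lesson"] for e in reversed(entries)
               if words & set(e["tags"])]
    return matches[:limit]
```

Keyword overlap is deliberately dumb (no embeddings, same spirit as the regex scorer), but it is enough to surface “last time you touched migrations, X broke” at exactly the moment it matters.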
What I shipped

The product is a knowledge base chatbot SaaS ( bestchatbot.io ) with Discord bot integration, embeddable web widgets, a Q&A learning system, visual knowledge graph, streaming responses, Stripe billing, multi-workspace isolation, the whole thing. All built with Claude Code using this system.

Not saying it was easy (it actually took me almost 9 months…). Response time optimization alone took weeks. The retrieval pipeline went through like 5 complete rewrites. But it works, it’s in production, and real people are using it.

Happy to answer questions about any of this. Still figuring out a lot of it tbh, but the system works and I’m shipping faster than I ever have. Hope this helps

Originally posted by u/cryptoviksant on r/ClaudeCode