Original Reddit post

I’ve spent the last few months heavily optimizing how I work with AI coding agents (mostly Codex and Claude Code). I’ve tried a lot of the popular repos, skills, frameworks, and workflows that have been shared across AI and coding subreddits. Most of them look promising, but I usually end up going back to a workflow that’s much simpler and more reliable.

I’m curious how different other people’s setups are, what has actually stuck long-term, and where I might still have room to improve.

My current workflow

  1. Repository documentation as the source of truth

     I maintain a customized AGENTS.md plus anywhere from 10–30 smaller markdown files inside a /docs folder. These contain things like:

     - Architecture decisions
     - Best practices
     - Coding conventions
     - Things the AI should avoid
     - Domain-specific knowledge

     When a document gets too large, I turn it into an orchestration file and split the content into smaller files. The goal is for the AI to read only what’s relevant to the current task rather than loading a giant knowledge dump every time.

  2. Hardening / pre-validation

     I borrowed a lot of ideas from this Harness Engineering repo.

     Before the agent starts working, I run a validation phase that checks things like:

     - Local environment is healthy
     - Correct branch is checked out
     - Dependencies are up to date
     - Required services are running
     - Etc.

     This prevents the agent from derailing halfway through a task because something wasn’t ready.

     After that, I use a custom skill that breaks the task into smaller steps and saves them into a step_plan.json. I also maintain a lightweight “working memory” file for long-running tasks so context compression doesn’t destroy the session state after a while.

  3. Execution

     The agent works one step at a time. It’s forced to follow the rules defined in the docs it has loaded and isn’t allowed to skip ahead.

  4. Validation

     When a step is finished, the agent validates its work against requirements provided in the original prompt. Depending on the task, that might include:

     - Test suites
     - Functional requirements from a PRD
     - Acceptance criteria
     - Custom validation rules

Extra tools/skills

A few things I use regularly:

- Caveman for token reduction
- Graphify for codebase indexing and file discovery

I also have some custom skills and guidelines around:

- E2E testing
- Playwright workflows
- Code review behavior
- Karpathy-style coding guidelines

Things I’m still not happy with

Memory feels like the weakest part of my setup. My current approach works, but it’s pretty naive. I’ve experimented with various memory frameworks and databases behind agents, but most of them required a lot of custom work and never provided enough value to justify the complexity or the never-ending MCP calls.

I also almost always start fresh chats. I rarely continue old conversations unless I’m doing small follow-up fixes and the context window hasn’t been compressed yet.

For code quality, I review AI-generated code the same way I’d review a teammate’s PR.

One thing I’ve learned is that I don’t need the biggest model for most work. With a good process, proper hardening, and small scopes, models like GPT mini variants, Sonnet, or Haiku are often good enough. I only reach for larger models when doing major refactors, architecture work, or changes that touch a lot of async/concurrent logic.

So what’s everyone else using?

- How do you handle memory?
- Do you use persistent knowledge stores or just docs/files?
- What skills, tools, or workflows have actually survived more than a few weeks of real usage?
- Has anything significantly improved reliability or reduced token usage for you?

submitted by /u/Durdinss
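
For anyone who wants something concrete, here is a minimal Python sketch of what steps 2 and 3 of the workflow above (pre-validation, then strictly one step at a time) might look like. The post does not show its actual scripts or the schema of step_plan.json, so everything below is an assumption made for illustration: the function names, the `steps` / `id` / `status` fields, and the idea of checking services by opening a TCP connection are all hypothetical.

```python
import json
import socket
import subprocess
from pathlib import Path
from typing import Dict, List, Optional, Tuple


def current_branch(repo: Path) -> Optional[str]:
    """Return the checked-out branch name, or None if git cannot be queried."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            cwd=repo, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(repo: Path, expected_branch: str,
              services: Dict[str, Tuple[str, int]]) -> List[str]:
    """Run the pre-validation checks; an empty list means ready to start."""
    failures = []
    branch = current_branch(repo)
    if branch != expected_branch:
        failures.append(f"expected branch {expected_branch!r}, found {branch!r}")
    for name, (host, port) in services.items():
        if not port_open(host, port):
            failures.append(f"service {name!r} not reachable at {host}:{port}")
    return failures


def next_step(plan: dict) -> Optional[dict]:
    """Return the first step that is not done yet, or None if all are done."""
    for step in plan["steps"]:
        if step["status"] != "done":
            return step
    return None


def complete_step(plan: dict, step_id: int) -> None:
    """Mark a step done, refusing to skip ahead of the next pending step."""
    step = next_step(plan)
    if step is None or step["id"] != step_id:
        raise ValueError(f"step {step_id} is not the next pending step")
    step["status"] = "done"


def save_plan(path: Path, plan: dict) -> None:
    """Persist the plan (hypothetical step_plan.json layout) to disk."""
    path.write_text(json.dumps(plan, indent=2), encoding="utf-8")


def load_plan(path: Path) -> dict:
    """Load a previously saved plan from disk."""
    return json.loads(path.read_text(encoding="utf-8"))
```

In this sketch, a wrapper would call `preflight(Path("."), "main", {"db": ("localhost", 5432)})` before handing control to the agent and stop if the returned list is non-empty (the branch name and the service entry are placeholders). After each step it would call `complete_step` followed by `save_plan`, so a fresh session can pick up from `next_step(load_plan(...))` instead of relying on chat history.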

Originally posted by u/Durdinss on r/ClaudeCode