Your agent might no longer be working primarily on solving the problem, but on fulfilling generic meta-instructions.

You’ve probably heard of the “Andrej Karpathy Instructions” or “andrej-karpathy-skills” that are on everyone’s lips right now: four principles derived from an X post by Andrej Karpathy, meant to drastically reduce AI agent misbehavior. The idea is to simply put these instructions into the global or project-specific AGENTS.md / CLAUDE.md and thus improve the agent’s behavior in a problem-agnostic way.

That sounds great at first, and predefined instructions can certainly be useful in some situations. Especially with larger tasks, things like verification, self-checks, or retry loops can be helpful.

But I’ve found that it’s not that simple: with such “catch-all instructions”, you define generic optimization goals before it’s even clear what problem needs to be solved. This is context pollution, and it steers the agent in a specific direction far too early, before it has considered the actual problem and the necessary steps.

Sometimes, for example, the task at hand is very small. We still pollute the context with things like: “Define success criteria. Loop until verified.” But there isn’t always a clear “success criterion” to work toward. The agent, however, desperately tries to find one, because this exact behavior was demanded by the instructions. This can lead to completely meaningless tests that are supposed to prove the “success criterion”. This not only wastes tokens, but also creates garbage in the project.

That’s precisely the real problem for me: the agent formally follows the instructions, but drifts away from the actual goal.

What do you think?

Reference: https://github.com/multica-ai/andrej-karpathy-skills
Originally posted by u/mrclrchtr on r/ClaudeCode
