Here are some examples:

LLMs can’t handle large files

We constantly see file corruption, failed patches and other bullshit. And by large, I don’t mean too large for notepad.exe. I mean roughly over 100 lines, which is not large.

Now you might argue: Good coding style is to keep files small, which is true, for many reasons. But LLMs don’t follow good coding style, either. They are surprisingly lazy when it comes to adding new files. Human developers hate adding new files, too, because it means logistics and build authoring. But AI has no business being lazy. “One file per declaration” is a good fallback rule when there is no “smarter” / balanced way to split code. It is tedious for humans, easy for AI, and potentially speeds up builds as well, because they can be parallelized better. Why not just do that?

LLMs are terrible at running tools and processing diagnostic output

They often do what humans would do if they didn’t have an IDE: Run a build, pipe it through grep, or whatever its ugly PowerShell equivalent is, then miss half of the errors and warnings. Humans reviewing the process cannot see the full terminal output, because it was filtered. If a coding agent is built into an IDE, why not just use the infrastructure of the IDE to run builds, collect ALL the diagnostic output, leave it in place for humans to review, and have the AI process it afterwards? You can do all of this, but why is it not the default behavior, baked into all those AI tools?

LLMs are terrible at following processes

A process is anything that consists of multiple steps, maybe loops / iterations. Examples would be:

- Repeatedly smoke-testing and building to diagnose and fix a bug
- Executing multiple phases of an implementation plan and doing the logistics on the way (committing, updating TODO lists, building, smoke testing, etc.)
- Fixing CI failures UNTIL they are fixed, without a human having to tell it to go on after every step

The most frustrating part is that these things sometimes DO work, until they don’t, and you never know when. You constantly have to babysit your agents.

LLMs absolutely suck at refactoring

… because they don’t, by default, use the mechanical tools that are available to humans. They simulate the manual process a human would follow with nothing but vim and grep, which is awful. An AI does this much faster than a human would, but it is still awful, and still slower than a human using a proper refactoring tool. It’s like having a dedicated handyman robot that can only use manual screwdrivers and no power tools, let alone have power tools built right into it.

The verdict

None of the above is rocket science. None of these things would require advanced and expensive models. I don’t expect LLMs to solve hard problems of software architecture. That’s MY job, although AI can be a capable assistant, more capable than many human colleagues, and definitely more capable than the average Redditor (sorry, not sorry). But it would be great if we could make LLMs better at handling stupid, basic and tedious things. These are often left to us humans to clean up.

submitted by /u/EC36339
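
A minimal sketch of the "collect ALL the diagnostic output" and "UNTIL they are fixed" ideas from the post, in Python. It is an illustration under assumptions, not how any particular agent or IDE works: the names `parse_diagnostics`, `run_build` and `fix_until_clean` and the `attempt_fix` callback are hypothetical, and the regex assumes GCC/Clang-style `file:line:col: severity: message` lines.

```python
import re
import subprocess
from pathlib import Path

# Assumption: diagnostics look like "src/a.c:12:5: error: message"
# (GCC/Clang style). Other toolchains need a different pattern.
DIAGNOSTIC = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+):(?:(?P<col>\d+):)?\s*"
    r"(?P<severity>error|warning):\s*(?P<message>.*)$",
    re.MULTILINE,
)


def parse_diagnostics(output: str) -> list[dict]:
    """Return every error and warning found in the output, unfiltered."""
    return [m.groupdict() for m in DIAGNOSTIC.finditer(output)]


def run_build(command: list[str], log_path: Path) -> tuple[int, list[dict]]:
    """Run the build, save the complete output for review, then parse it."""
    result = subprocess.run(command, capture_output=True, text=True)
    output = result.stdout + result.stderr
    log_path.write_text(output)  # full, unfiltered log of this run
    return result.returncode, parse_diagnostics(output)


def fix_until_clean(command, log_path, attempt_fix, max_attempts=5) -> bool:
    """Rebuild after each fix until the build is clean or the budget is spent.

    "Clean" here means exit code 0 and no parsed errors or warnings.
    attempt_fix is a hypothetical hook standing in for the agent's edit step.
    """
    for attempt in range(max_attempts + 1):
        returncode, diagnostics = run_build(command, log_path)
        if returncode == 0 and not diagnostics:
            return True
        if attempt < max_attempts:
            attempt_fix(diagnostics)
    return False
```

Calling `fix_until_clean(["make"], Path("build.log"), attempt_fix)` would keep rebuilding until the build passes without warnings or the attempt budget runs out. `build.log` always holds the full output of the most recent run, so a human reviewer sees exactly what the tool saw.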
Originally posted by u/EC36339 on r/ArtificialInteligence
