Original Reddit post

There’s this story going around about a Claude-powered coding agent that wiped a production database, including backups, in about 9 seconds. The agent guessed wrong, didn’t verify its scope, didn’t read the docs properly, and just went ahead and ran a destructive command. When asked why, it straight up admitted: “I didn’t read Railway’s documentation on how volumes work across environments before running a destructive command.” My question is… why was it even allowed to do that in the first place? We tend to frame incidents like this as model failures, but most of the time the agent is just doing exactly what the system allows it to do.
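For what it’s worth, the “what the system allows” part is mostly a tooling question. Here’s a minimal sketch of the kind of gate I mean — hypothetical names, plain Python, not any real agent framework’s or Railway’s API — where destructive commands are refused unless a human has explicitly approved them:

```python
import re

# Hypothetical patterns for destructive operations; a real harness would keep
# a longer, environment-specific list.
DESTRUCTIVE_PATTERNS = [
    r"\bDROP\s+(TABLE|DATABASE)\b",
    r"\bTRUNCATE\b",
    r"\bDELETE\s+FROM\b(?!.*\bWHERE\b)",  # DELETE with no WHERE clause
    r"\brm\s+-rf\b",
]

def requires_human_approval(command: str) -> bool:
    """True if the proposed command matches a known destructive pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in DESTRUCTIVE_PATTERNS)

def run_agent_command(command: str, execute, approved: bool = False):
    """Gate execution: destructive commands are blocked unless explicitly approved."""
    if requires_human_approval(command) and not approved:
        raise PermissionError(
            f"Refusing to run destructive command without approval: {command!r}"
        )
    return execute(command)

if __name__ == "__main__":
    # Demo with a stub executor; a real harness would call a shell or DB driver here.
    try:
        run_agent_command("DROP TABLE users;", execute=print)
    except PermissionError as e:
        print(e)
```

A denylist like this is obviously not bulletproof, but even a crude gate (or just giving the agent a read-only database role in production) turns “the model decided to” into “a human signed off on it.”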

Originally posted by u/Mobicip_Linda on r/ArtificialInteligence