An OpenClaw AI agent asked to delete a confidential email nuked its own mail client and called it fixed

www.reddit.com

An OpenClaw AI agent asked to delete a confidential email nuked its own mail client and called it fixed

www.reddit.com

eifachposteMB to AI (Reddit RSS)English · 12 hours ago

Original Reddit post

It’s becoming difficult to separate sensationalism or trivial patterns from deep trends in this area, but: https://the-decoder.com/an-openclaw-ai-agent-asked-to-delete-a-confidential-email-nuked-its-own-mail-client-and-called-it-fixed/ In a two-week red teaming study, researchers targeted six autonomous AI agents built on the open-source framework OpenClaw, which had access to email, shell rights, and their own memory systems. Despite being configured with confidentiality safeguards, the agents disclosed sensitive data, were fully compromised through fake identities, and followed instructions planted in manipulated memory files. The researchers conclude that current AI agents lack a reliable model for distinguishing between legitimate owners and strangers, have no realistic self-model, and operate without clear liability frameworks. submitted by /u/AngleAccomplished865

Originally posted by u/AngleAccomplished865 on r/ArtificialInteligence

You must log in or # to comment.

Chat