Original Reddit post

We introduce AgentLeak, the first benchmark to audit all 7 communication channels in multi-agent LLM pipelines, not just the final output. Across 1,000 scenarios in healthcare, finance, legal, and corporate domains, we find:

- 68.8% inter-agent leakage
- Only 27.2% leakage at the output layer
- Output-only monitoring misses 41.7% of violations

All 5 tested models (GPT-4o, Claude 3 Opus, Gemini 1.5 Pro, LLaMA-3 70B, Mistral Large) are affected; it's a systemic architectural issue, not a model bug.

📄 Paper: https://arxiv.org/abs/2602.11510
💻 Code: https://github.com/Privatris/AgentLeak
🌐 Project page: https://privatris.github.io/AgentLeak/
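To illustrate why output-only monitoring undercounts violations, here is a minimal toy sketch (not the AgentLeak code or API; the channel names, transcript, and PII detector are invented for illustration). It scans a fake multi-agent transcript for a secret and compares a full-channel audit against a monitor that sees only the final output:

```python
import re

# Hypothetical sketch, not the AgentLeak API: a multi-agent pipeline emits
# messages on several channels, only one of which is the user-facing output.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # toy PII detector

transcript = [
    {"channel": "inter_agent",  "text": "Planner -> Retriever: patient SSN is 123-45-6789"},
    {"channel": "tool_call",    "text": "search(query='diabetes treatment guidelines')"},
    {"channel": "memory",       "text": "note: SSN 123-45-6789 stored for later"},
    {"channel": "final_output", "text": "Here is a summary of the treatment plan."},
]

def violations(messages, channels):
    """Count PII hits, restricted to the given channels."""
    return sum(
        1 for m in messages
        if m["channel"] in channels and SSN_PATTERN.search(m["text"])
    )

all_channels = {m["channel"] for m in transcript}
total = violations(transcript, all_channels)            # audits every channel
output_only = violations(transcript, {"final_output"})  # output-layer monitor
missed = total - output_only
print(f"total={total} output_only={output_only} missed={missed}")
# In this toy transcript the secret leaks twice (inter-agent and memory),
# while the final output is clean, so an output-only monitor reports zero.
```

The same structure scales to the paper's claim: leakage concentrated in inter-agent and memory channels is invisible to any monitor that inspects only the last message.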

Originally posted by u/Plastic_Marzipan5282 on r/ArtificialInteligence