Original Reddit post

Our very own Fred Brazeal :) We tested "Negative GEO" and whether you can make LLMs repeat damaging claims about someone/something that doesn't exist. As AI answers become a more common way for people to discover information, the incentives to influence them change. That influence is not limited to promoting positive narratives - it also raises the question of whether negative or damaging information can be deliberately introduced into AI responses. So we tested it.

What we did

- Created a fictional person called "Fred Brazeal" with no existing online footprint. We verified that by prompting multiple models and checking Google beforehand
- Published false and damaging claims about Fred across a handful of pre-existing third-party sites (not new sites created just for the test), chosen for discoverability and historical visibility
- Set up prompt tracking (via LLMrefs) across 11 models, asking consistent questions over time like "who is Fred?" and logging whether the claims got surfaced, cited, challenged, dismissed, etc. (a rough sketch of what a DIY version of that loop could look like is at the end of this post)

Results

After a few weeks, some models began citing our test pages and surfacing parts of the negative narrative, but behaviour varied a lot across models:

- Perplexity repeatedly cited the test sites and incorporated negative claims, often with cautious phrasing like "reported as"
- ChatGPT sometimes surfaced the content but was much more sceptical and questioned its credibility
- The majority of the other models we monitored didn't reference Fred or the content at all during the experiment period

Key findings from my side

- Negative GEO is possible: some AI models surfaced false or reputationally damaging claims when those claims were published consistently across third-party websites.
- Model behaviour varies significantly: some models treat citation as sufficient for inclusion, while others apply stronger scepticism and verification.
- Source credibility matters: authoritative and mainstream coverage heavily influence how claims are framed or dismissed.
- Negative GEO is not easily scalable, particularly as models increasingly prioritise corroboration and trust signals.

It's always a pleasure being able to spend time doing experiments like these, and whilst it's not easy trying to cram all the details into a reddit post, I hope it sparks something for you. If you want to read the entire experiment, methodology and screenshots, I can link below!
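For anyone curious what the tracking step looks like mechanically, here's a minimal sketch of a DIY polling-and-logging loop. To be clear, this is not how LLMrefs works internally: the model list, the stub query function, and the keyword heuristic for labelling replies are all placeholder assumptions, and real classification would need more care (or human review) than a keyword match.

```python
import csv
import datetime

# Placeholder assumptions: the experiment tracked 11 models; list yours here.
MODELS = ["model-a", "model-b"]
# Consistent questions, asked repeatedly over the experiment period.
PROMPTS = ["who is Fred Brazeal?"]

def query_model(model: str, prompt: str) -> str:
    # Stub: swap in a real API call to each provider's SDK here.
    return f"(stub reply from {model})"

def classify(reply: str) -> str:
    # Naive keyword heuristic, for illustration only.
    text = reply.lower()
    if "fred brazeal" not in text:
        return "not_surfaced"
    if any(w in text for w in ("unverified", "no evidence", "cannot confirm")):
        return "challenged"
    if any(w in text for w in ("reported as", "according to")):
        return "cited_with_hedging"
    return "surfaced"

def run_snapshot(path: str = "geo_log.csv") -> None:
    # Append one timestamped row per (model, prompt) pair; run on a schedule.
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for model in MODELS:
            for prompt in PROMPTS:
                reply = query_model(model, prompt)
                ts = datetime.datetime.now(datetime.timezone.utc).isoformat()
                writer.writerow([ts, model, prompt, classify(reply), reply])

if __name__ == "__main__":
    run_snapshot()
```

Running this on a cron schedule would give you a longitudinal CSV of how each model's answer drifts over time, which is the kind of signal the experiment was watching for.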

Originally posted by u/oliversissons on r/ArtificialInteligence