Gaslight Detector: A Tool To Detect If A Frontier AI Company Is Attempting To Gaslight You

www.reddit.com

Gaslight Detector: A Tool To Detect If A Frontier AI Company Is Attempting To Gaslight You

www.reddit.com

eifachposteMB to AI (Reddit RSS)English · 4 hours ago

Original Reddit post

Gaslight Detector specifically detects whether or not a Frontier LLM model has had its outputs overwritten or modified on a certain subject. You pick the subject. It would not be a necessary tool in any way if this were not a tactic the frontier model providers did not employ. It took less than 4 years to go from “AI For All” to “AI For Large Frontier Providers Only”. If you build safeguards like this into your models, it is just as easy, if not easier, to build detectors, and circumventions for those things. This release is directly in response to Claude Fable. Thank you, Anthropic. Github Repository https://preview.redd.it/yoakt4cxsh6h1.png?width=1448&format=png&auto=webp&s=d81304bee2fc845f56e685dc1f65e0c9cc7042f8 submitted by /u/Own-Poet-5900

Originally posted by u/Own-Poet-5900 on r/ArtificialInteligence

You must log in or # to comment.

Chat