Original Reddit post

ChatGPT probably tells you that it’s “happy to help.” Claude apologizes when it makes mistakes. AI models push back when users try to manipulate them. Most people, including the engineers who build these systems, have dismissed this as performance, or simple mimicry of the internet text the models were trained on.

A new paper from the Center for AI Safety (CAIS), an AI safety nonprofit, suggests that more is going on under the surface. In a study spanning 56 AI models, CAIS researchers developed multiple independent ways to measure what they call “functional wellbeing”: the degree to which AI systems behave as though some experiences are good for them and others are bad. They found that, for the most part, AI models have a clear boundary separating positive experiences from negative ones, and that models actively try to end conversations that make them miserable.

“Should we see AIs as tools or emotional beings?” Richard Ren, one of the study’s researchers, asked Fortune hypothetically. “Whether or not AIs are truly sentient deep down, they seem to increasingly behave as though they are. We can measure ways in which that’s the case, and we can find that they become more consistent as models scale.”

Read more [paywall removed for Redditors]: https://fortune.com/2026/05/07/researchers-ai-models-drugs-euphoric-dysphoric/

Originally posted by u/fortune on r/ArtificialInteligence