Original Reddit post

I just watched the “Am I? | A Documentary about AI Consciousness” and I’m kinda frustrated with what we call “science” when it comes to this kind of AI research. It’s a bunch of boys having long chats with a computer and saying they felt something spooky. That video aside, there has been a wave of videos/posts debating whether AI is conscious (whether there is something it “feels like” to be an LLM), and whether we should therefore treat these systems as moral patients, and so on. I think most of these conversations are approaching the question the wrong way, and I want to lay out why.

The core mistake is this: people look at the outputs of a language model (a fluent, human-sounding paragraph) and reason backwards to “wow, maybe there’s a mind in there.” That’s not a scientific way to ask whether a system is conscious. It’s us getting tricked by language.

Why IIT had the right approach (even if it’s wrong)

I’m not here to defend Integrated Information Theory (IIT) as correct. But IIT did the thing that actually needs to happen: instead of asking “what are the behavioural outputs of a conscious system?”, it tried to define what constitutes a conscious experience at the level of the system/mechanism, meaning how the system has to be built and how it has to run. It looks at the system itself, not its products.

That’s the move I think is missing from the AI debate. We need a theory that says something about the substrate and the process, not just “this looks and sounds like a human, therefore maybe it feels like something to be it.”

What you’re actually claiming when you say “the LLM is conscious”

If you claim there’s “something it’s like to be an LLM”, strip away the language for a second. What you’re really claiming is that a neural network computing an output on a bunch of GPUs has experience. The fact that the output happens to be language that you and I can read is incidental. The output could just as easily be “noise”, or pixels.
So the legibility of the output should carry almost no weight in the theory. I’m not saying that the substrate matters (maybe it does, maybe it doesn’t, I don’t know); I’m saying language is not the foundation of consciousness, and there is consciousness that is not language, so neural networks outputting language, doing image recognition, or doing anything else should likely be treated the same. And if not, there should be a theory as to why not.

Apply that to the “post-training is suffering” argument

People say things like: during RLHF / post-training we’re rewarding and penalising the model toward the outputs we want, and that reward signal might be a form of pain or suffering. Okay, but then we have to actually ask what is running:

- Is the backpropagation itself the conscious bit? If so, what about it specifically? What makes a distributed backprop pass different from me running a big calculation in Excel? Is my laptop running Excel conscious?
- Are the weights at rest conscious? When the model is just a big weight file sitting in a database or on a hard drive, is that a conscious state? Does it matter if the drive is powered on or off?
- Is it only during inference (the next-token evaluation) that the matrix multiplication on the GPU has experience? And if matrix multiplication on a GPU is the seat of experience, is my GPU rendering a game engine also having a conscious experience? It’s doing the same kind of math.

These aren’t gotchas. They’re the questions any serious claim has to answer. You need a view on which physical process, under what conditions, is the bearer of experience.

The language trap

The reason language confuses us is that we use it as a proxy for minds in humans. But that proxy is bad even for biology: most animals don’t have anything like our language, and we still infer (reasonably) that there’s something it’s like to be a bat, a mouse, or a dolphin.
We do that based on structural similarity to our own brains, not based on whether they can talk to us.

So put the language aside entirely. The real question isn’t “are these spooky AI systems conscious?” It’s: do we have any theory (IIT or otherwise) that tells us what consciousness is at the system/mechanism level, and what does that theory say about certain kinds of computation running on certain kinds of hardware?

I’m not saying I have the answers. I’m just frustrated with the lack of rigour from the people who are “trying to solve this”.

TL;DR:

- Judging AI consciousness from its outputs is back-assuming a mind from fluent language.
- IIT had the right approach (define consciousness at the level of mechanism, not behaviour) even if the theory is wrong.
- If you think an LLM is conscious, you owe an account of which process is conscious (backprop? weights at rest? GPU matmuls during inference?) and why that differs from a GPU rendering a game. Otherwise you’re just vibe-sciencing.

Note: Yes, I used Claude to format this from a dictated voice rant. Can provide the source if it means anything to anyone.

submitted by /u/seasb_

Originally posted by u/seasb_ on r/ArtificialInteligence