I’ve been reading more about how LLMs store information internally, and what I’ve learned is just amazing!

The core idea: instead of one neuron encoding one concept, concepts are encoded as quasi-orthogonal directions in activation space. This quasi-orthogonality is the key to explaining model capacity. Not only are concepts spread across many neurons, but completely separate concepts are allowed to have a small non-zero overlap with each other. A d-dimensional space has only d exactly orthogonal directions, but the number of almost-orthogonal directions it can hold grows exponentially with d, so the model can represent far more concepts than it has neurons. This is called feature superposition.

This is very similar to how our genes encode information. And this is not done on purpose by LLM makers; it is an emergent property of the training process and architecture. This is quite fascinating and very unexpected.

If you want to read more about this, I’ve written a complete deep dive on the subject: https://thethoughtprocess.xyz/en/how-does-llms-store-knowledge-a-deep-dive-into-feature-superposition
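To make the quasi-orthogonality idea a bit more concrete, here is a minimal NumPy sketch. It is only a toy illustration under my own assumptions: random Gaussian directions stand in for learned features (a real model learns its directions during training), and the sizes `dim = 256` and `n_features = 2048` as well as the helper names `random_unit_vectors` and `max_abs_overlap` are made up for the example. It packs 8x more "features" than dimensions into the space, checks that no pair overlaps strongly, and then reads a few active features back out of their sum with plain dot products.

```python
import numpy as np

def random_unit_vectors(n_features, dim, seed=0):
    """Draw n_features random directions in a dim-dimensional space.

    Assumption: random directions are a stand-in for learned feature directions.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal((n_features, dim))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def max_abs_overlap(vectors):
    """Largest |cosine similarity| between two different vectors."""
    gram = vectors @ vectors.T      # pairwise dot products of unit vectors
    np.fill_diagonal(gram, 0.0)     # ignore each vector's overlap with itself
    return float(np.abs(gram).max())

dim = 256            # hypothetical number of "neurons"
n_features = 2048    # hypothetical number of concepts, 8x more than neurons

features = random_unit_vectors(n_features, dim)
overlap = max_abs_overlap(features)
print(f"{n_features} directions in {dim} dims, worst-case overlap: {overlap:.3f}")

# Superposition: activate a few features at once by summing their directions.
active = {3, 500, 1999}
activation = features[sorted(active)].sum(axis=0)

# Read-out: score every feature by its dot product with the activation vector.
scores = features @ activation
recovered = {int(i) for i in np.argsort(scores)[-len(active):]}
print("active:   ", sorted(active))
print("recovered:", sorted(recovered))
```

The read-out works here because only a handful of features are active at the same time, so the interference from the small overlaps stays well below the signal. Activate too many features at once and that noise adds up, which is why sparsity matters for superposition.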
Originally posted by u/Kindly-Hawk on r/ArtificialInteligence

