Who Needs Attention? Spiking Language Modeling via Synaptogenic Adaptive Processing Units
zenodo.org

A spiking neural network generates coherent multi-turn conversation from pure next-token prediction, without attention, without RLHF, and without filtering, running on a $290 used GPU.

We introduce the Synaptogenic Adaptive Processing Unit Language Model (SAPU-LM), a multi-timescale spiking reservoir architecture that replaces attention entirely with trained recurrent dynamics in leaky integrate-and-fire neurons. The chatbot "Nemo" emerges from freezing the learned spiking topology and retraining only 8.5% of parameters on conversational data, achieving 38.05 test perplexity on DailyDialog. The architecture spans a lineage from a frozen Echo State Network (~19,500 perplexity) to 84.15 perplexity (M-SAPU-LM) on a 10M-token WikiText-103 subsample, an ~80× improvement from training reservoir weights via surrogate gradients.

A Tiling Parallel SAPU (TPSAPU) shares a single 512×512 recurrent weight matrix across three timescales and recovers to 84.67 perplexity after L1 pruning, suggesting that the membrane time constant τ alone creates functional differentiation. Ternary quantization compresses the learned recurrent core to ~45 KB at 93.6% sparsity. L1 pruning reveals timescale-dependent topology emergence: fast reservoirs maintain distributed connectivity while slow reservoirs self-organize into diagonal self-excitatory memory cells, a structure discovered by the network rather than imposed by design. The trained ternary spiking core maps directly to analog resistor-capacitor-comparator circuits; a proof-of-concept hardware exporter has been developed.

To our knowledge, this is the first demonstration of open-ended next-token prediction using a trained spiking reservoir with no attention mechanism. Code and checkpoints: https://gitlab.com/AntonioGCGonzalez/synaptogenic-adaptive-processing-unit-language-models

This is a preliminary technical report.
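To make the central claim concrete, here is a minimal sketch of leaky integrate-and-fire dynamics in which several "reservoirs" share one recurrent weight matrix and differ only in their membrane time constant τ, the mechanism the TPSAPU result points to. All names, parameter values, and the Euler-step formulation are illustrative assumptions, not the report's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 512                                    # reservoir size (matching the 512x512 matrix)
W = rng.normal(0, 1 / np.sqrt(N), (N, N))  # one shared recurrent weight matrix

def lif_step(v, spikes, x, tau, dt=1.0, v_th=1.0):
    """One Euler step of LIF dynamics: leak toward zero, integrate
    recurrent spikes plus external input x, fire and reset at v_th."""
    v = v + (dt / tau) * (-v + W @ spikes + x)
    new_spikes = (v >= v_th).astype(float)
    v = np.where(new_spikes > 0, 0.0, v)   # hard reset after a spike
    return v, new_spikes

# Three timescales (fast / medium / slow) sharing the same W.
taus = [2.0, 8.0, 32.0]
x = rng.normal(0, 0.5, N)                  # fixed input current for the demo

rates = []
for tau in taus:
    v, s = np.zeros(N), np.zeros(N)
    total = 0.0
    for _ in range(100):
        v, s = lif_step(v, s, x, tau)
        total += s.sum()
    rates.append(total / (100 * N))        # mean firing rate per neuron per step

# Identical W, different tau: the timescales integrate the same input
# at different speeds and settle into different firing-rate regimes.
print({f"tau={t}": round(r, 4) for t, r in zip(taus, rates)})
```

In this toy setting the only difference between the three populations is τ, which is the sense in which τ alone could create functional differentiation.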
Several configurations are ongoing; results will be updated in subsequent revisions.
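The ~45 KB figure can be sanity-checked with a sketch of ternary weight quantization: each recurrent weight is mapped to {-1, 0, +1} with a shared scale, so a dense 512×512 core costs about 64 KB at 2 bits/weight, and the report's 93.6% sparsity (from L1 pruning) would shrink a sparse encoding further. The threshold rule below (0.7 × mean |W|, as in common ternary-weight schemes) is an assumption, not necessarily the report's method.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 0.05, (512, 512))       # stand-in for a trained recurrent core

def ternarize(w, thresh_factor=0.7):
    """Quantize weights to {-1, 0, +1} with one scalar scale.
    Weights below the magnitude threshold are zeroed."""
    t = thresh_factor * np.abs(w).mean()
    q = np.zeros_like(w, dtype=np.int8)
    q[w > t] = 1
    q[w < -t] = -1
    scale = np.abs(w[q != 0]).mean() if np.any(q) else 0.0
    return q, scale

q, scale = ternarize(W)
sparsity = (q == 0).mean()
kb = q.size * 2 / 8 / 1024                # dense storage at 2 bits per ternary weight
print(f"sparsity={sparsity:.3f}, scale={scale:.4f}, ~{kb:.0f} KB dense")
```

The reconstructed matrix is `scale * q`; the high sparsity reported in the paper comes from pruning rather than from the quantization threshold itself.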
Originally posted by u/killerjag on r/ArtificialInteligence
