I’ve been experimenting with a different inference architecture for GGUF models. DoE is a single-C-file runtime that wraps any GGUF model in a dynamic "parliament" of LoRA experts that vote and adapt during inference.

Compile:

```
cc doe.c -O3 -lm -lpthread -o doe
```

Run:

```
./doe --model model.gguf --serve 8080
```

Features:

- works with existing GGUF models (Llama, Qwen, Mistral, SmolLM)
- weights are mmap’ed read-only
- LoRA experts operate on top of the base model
- experts vote per token to determine the final residual update
- experts can spawn or disappear during inference based on usage
- simple gradient-free weight adaptation during generation

Other details:

- ~3184 LOC in a single C file
- no runtime dependencies
- auto-detects the tokenizer and chat template
- built-in HTTP chat server
- optional CUDA / BLAS acceleration

repo: https://github.com/ariannamethod/doe

arch: https://github.com/ariannamethod/doe/blob/main/docs/doe///_architecture.md
Originally posted by u/ataeff on r/ArtificialInteligence
