Original Reddit post

| Dataset | Model | Acc | F1 | Δ Logistic (pp) | Δ Static (pp) | Params (Avg) | Steps | Infer (ms) | Size |
|---|---|---|---|---|---|---|---|---|---|
| Banking77-20 | Logistic TF-IDF | 92.37% | 0.9230 | +0.00 | +0.76 | 64,940 | 0.00M | 0.473 | 1.00x |
| Banking77-20 | Static Seed | 91.61% | 0.9164 | -0.76 | +0.00 | 52,052 | 94.56M | 0.264 | 0.80x |
| Banking77-20 | Dynamic Seed (Distill) | 93.53% | 0.9357 | +1.17 | +1.92 | 12,648 | 70.46M | 0.232 | 0.20x |
| CLINC150 | Logistic TF-IDF | 97.00% | 0.9701 | +0.00 | +1.78 | 41,020 | 0.00M | | 1.00x |
| CLINC150 | Static Seed | 95.22% | 0.9521 | -1.78 | +0.00 | 52,052 | 66.80M | 0.302 | 1.27x |
| CLINC150 | Dynamic Seed | 94.78% | 0.9485 | -2.22 | -0.44 | 10,092 | 28.41M | 0.324 | 0.25x |
| CLINC150 | Dynamic Seed (Distill) | 95.44% | 0.9544 | -1.56 | +0.22 | 9,956 | 32.69M | 0.255 | 0.24x |
| HWU64 | Logistic TF-IDF | 87.94% | 0.8725 | +0.00 | +0.81 | 42,260 | 0.00M | | 1.00x |
| HWU64 | Static Seed | 87.13% | 0.8674 | -0.81 | +0.00 | 52,052 | 146.61M | 0.300 | 1.23x |
| HWU64 | Dynamic Seed | 86.63% | 0.8595 | -1.31 | -0.50 | 12,573 | 62.54M | 0.334 | 0.30x |
| HWU64 | Dynamic Seed (Distill) | 87.23% | 0.8686 | -0.71 | +0.10 | 13,117 | 62.86M | 0.340 | 0.31x |
| MASSIVE-20 | Logistic TF-IDF | 86.06% | 0.7324 | +0.00 | -1.92 | 74,760 | 0.00M | | 1.00x |
| MASSIVE-20 | Static Seed | 87.98% | 0.8411 | +1.92 | +0.00 | 52,052 | 129.26M | 0.247 | 0.70x |
| MASSIVE-20 | Dynamic Seed | 86.94% | 0.7364 | +0.88 | -1.04 | 11,595 | 47.62M | 0.257 | 0.16x |
| MASSIVE-20 | Dynamic Seed (Distill) | 86.45% | 0.7380 | +0.39 | -1.53 | 11,851 | 51.90M | 0.442 | 0.16x |
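For readers unfamiliar with the "Logistic TF-IDF" row: that's the classic sparse-features intent baseline, TF-IDF n-grams fed into logistic regression. A minimal scikit-learn sketch, where the toy utterances, labels, and hyperparameters are my own illustration, not the poster's benchmark setup:

```python
# Minimal sketch of a "Logistic TF-IDF" intent-classification baseline.
# NOTE: the toy utterances/labels and hyperparameters below are
# illustrative assumptions, not the original benchmark configuration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny stand-in for an intent dataset (Banking77 and friends have
# thousands of labelled utterances across 20-150 intents).
texts = [
    "my card was declined at the store",
    "i want to report a lost card",
    "how do i top up my account balance",
    "transfer money to my savings account",
]
labels = ["card_declined", "lost_card", "top_up", "transfer"]

# TF-IDF over word uni/bigrams -> multinomial logistic regression.
baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
baseline.fit(texts, labels)
pred = baseline.predict(["my card got declined"])[0]
```

This pipeline is the 1.00x reference that the Size column above is measured against.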
I set out to build a
memory-first AI system
and accidentally ended up building two.
Magnus
→ a memory-first system that organizes knowledge
Seed
→ an architecture discovery system that finds the smallest model that still wins
I ran Seed across multiple real intent datasets.
What stood out:
On Banking77 → better accuracy and a ~5x smaller model
On MASSIVE → consistent accuracy wins across all Seed variants
On CLINC150 / HWU64 → not always higher accuracy, but ~3–4x smaller models
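The post doesn't spell out what "(Distill)" means, but the standard reading is knowledge distillation: the small searched model is trained to match a larger teacher's softened predictions rather than only the hard labels. A stdlib-only sketch of that loss, where the temperature and the logits are made up for illustration:

```python
# Illustrative knowledge-distillation loss: cross-entropy between
# temperature-softened teacher and student distributions. The
# temperature t and the example logits are assumed values.
import math

def softmax(logits, t=1.0):
    m = max(z / t for z in logits)           # subtract max for stability
    exps = [math.exp(z / t - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distill_loss(student_logits, teacher_logits, t=2.0):
    """Cross-entropy of student predictions against soft teacher targets."""
    p = softmax(teacher_logits, t)           # soft teacher targets
    q = softmax(student_logits, t)           # student predictions
    return -sum(pi * math.log(qi + 1e-12) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 0.9, 0.4]              # roughly agrees with teacher
far_student = [0.5, 3.5, 1.0]                # disagrees with teacher
```

That reading is consistent with the CLINC150 rows above: the distilled dynamic model (95.44%) recovers most of the accuracy the plain dynamic model (94.78%) gave up, at essentially the same size.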
The pattern is clear:
👉 smaller, structured models can compete with — and sometimes beat — larger baselines
Traditional approach:
scale model size → hope for gains
Seed:
search for structure → compress intelligently
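One way to make "search for structure → compress intelligently" concrete is a size-penalized objective: score every candidate architecture by accuracy minus a penalty on parameter count, and keep the best scorer. The objective, the penalty weight, and the candidate list below are my own illustration (accuracy/param numbers borrowed from the CLINC150 rows), not Seed's actual search:

```python
# Illustrative size-penalized model selection: "the smallest model
# that still wins". The penalty weight lam is an assumed knob; the
# first two candidates reuse accuracy/param numbers from the
# CLINC150 rows above, and "tiny" is a made-up underfit model.
def score(accuracy: float, params: int, lam: float = 1e-6) -> float:
    """Higher is better: reward accuracy, penalize parameter count."""
    return accuracy - lam * params

candidates = [
    {"name": "static-seed",  "accuracy": 0.9522, "params": 52_052},
    {"name": "dynamic-seed", "accuracy": 0.9478, "params": 10_092},
    {"name": "tiny",         "accuracy": 0.9000, "params": 2_000},
]

best = max(candidates, key=lambda c: score(c["accuracy"], c["params"]))
```

With this weight the mid-size candidate wins: it gives up 0.44 points of accuracy but sheds roughly 80% of the parameters, while the tiny model's accuracy loss outweighs its size savings.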
This isn’t about bigger models.
👉 it’s about finding the smallest model that still wins
Not AGI
Not “we solved NLU”
But a real signal that:
👉 structure > scale

Originally posted by u/califalcon on r/ArtificialInteligence