IBM just released Granite 4.1, a family of open-source language models built specifically for enterprise use. Three sizes, Apache 2.0 licensed, trained on 15 trillion tokens with a level of pipeline obsession that's worth understanding.

But there's one result in the benchmarks I keep coming back to. The 8B model. Dense architecture, no MoE tricks, no extended reasoning chains. It matches or beats Granite 4.0-H-Small across basically every benchmark they ran. That older model has 32B parameters with 9B active. This one has 8 billion. Full stop. That result is either very impressive or it means the old model was underbuilt. Probably both.

Here's how they built it, what the numbers actually say, and whether any of it matters for your use case.
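If you want to poke at the 8B model yourself, it should load like any other causal LM through Hugging Face Transformers. Here's a minimal sketch; the repo ID is my guess at the naming and worth verifying against the ibm-granite organization on Hugging Face, and the prompt is just an illustration.

```python
# Minimal sketch: load a Granite 8B checkpoint and generate a short completion.
# The repo ID below is an assumption about IBM's naming convention; check the
# ibm-granite org on Hugging Face for the exact checkpoint name before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.1-8b-instruct"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Summarize the key terms of the Apache 2.0 license in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```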
Originally posted by u/shikizen on r/ArtificialInteligence