Google recently dropped Gemma 2 (2B, 7B, 9B, and 27B). I've been playing with the 7B version on my laptop, but I wanted to know what it would take to run the 27B model locally. So I asked the "LLM Oracle" (a weird little tool that gives hardware advice based on real specs): "Can I run Gemma 27B on a single GPU?" Here's what it recommended (paraphrased):

- **RTX 4070 Ti Super 16GB** – ~30‑40 tokens/s at Q4
- **RTX 4090 24GB** – runs 27B at full 8‑bit quality, or at 2× the speed with quantization
- **MacBook Air M3 16GB** – surprisingly, it said the Air can run the 7B version silently, and the Pro M5 48GB can handle 27B with CPU offloading

Curious – what's the most unexpected hardware you've used to run a local LLM?
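For anyone sanity-checking claims like these, a rough back-of-the-envelope VRAM estimate is just parameter count × bits per weight, plus some headroom for the KV cache and runtime buffers. This is a minimal sketch, not a precise model: the `overhead_gb` allowance is an assumption that varies a lot with context length and backend.

```python
def estimate_vram_gb(params_billions: float, bits_per_weight: int,
                     overhead_gb: float = 1.5) -> float:
    """Rough VRAM estimate for model weights plus runtime overhead.

    overhead_gb is a guessed allowance for KV cache, activations,
    and buffers; real usage depends on context length and backend.
    """
    weights_gb = params_billions * bits_per_weight / 8  # 8 bits per byte
    return weights_gb + overhead_gb

# 27B at Q4: ~13.5 GB of weights + overhead -> tight on a 16GB card
print(round(estimate_vram_gb(27, 4), 1))   # 15.0
# 27B at 8-bit: well past 16GB, plausible on a 24GB card only barely
print(round(estimate_vram_gb(27, 8), 1))   # 28.5
# 7B at Q4: comfortable on a 16GB laptop
print(round(estimate_vram_gb(7, 4), 1))    # 5.0
```

By this estimate the 16GB-card claim for 27B at Q4 is borderline (weights alone are ~13.5 GB), and 8-bit 27B really does need a 24GB-class GPU or CPU offloading.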
Originally posted by u/Remarkable-Dark2840 on r/ArtificialInteligence
