Building on an idea I’ve had, here’s what the first version of EffectiveTPS will look like. Core display (v1):
- Clean table comparing popular local models
- Raw TPS (the marketing number everyone shows)
- eTPS (the new metric that actually measures useful output in real conversations)
- Time to First Token (how long you wait before it starts replying)
- Effectiveness Index = (eTPS ÷ Raw TPS) × 100, higher is better (a quick sketch of the calculation is at the end of the post)

Example leaderboard (early test data):

| Model         | Raw TPS | eTPS | Time to First Token | Effectiveness Index |
|---------------|---------|------|---------------------|---------------------|
| Llama 3.1 70B | 45.2    | 38.7 | 1.4s                | 86                  |
| Qwen2.5-32B   | 68.4    | 52.1 | 0.8s                | 76                  |
| Gemma 2 27B   | 71.3    | 44.6 | 0.6s                | 63                  |

I've been running these tests through a structured multi-turn analysis framework I built to evaluate complex workflows. That's how eTPS was stress-tested: not just single-turn benchmarks, but real back-and-forth sessions.

An advanced mode (toggle) will add latency percentiles, cost-per-quality, and consistency scoring later. For v1 the goal is to keep it dead simple and immediately useful, even if you're not deep into AI. The whole point is to cut through the noise and show which models actually deliver useful work, not just raw speed.

What do you think should be added (or removed) for the first version? Any metrics you'd want to see front-and-center?

TL;DR: Simple leaderboard with Raw TPS, eTPS, Time to First Token, and a clear Effectiveness Index. Advanced stuff stays hidden until you want it. Feedback welcome.
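For anyone who wants the math as code, here's a minimal sketch of the Effectiveness Index calculation using the early test data above. The names `ModelResult` and `effectiveness_index` are hypothetical, just for illustration; eTPS itself comes out of the multi-turn framework and isn't specified in this post, so the sketch treats it as an already-measured input.

```python
from dataclasses import dataclass

@dataclass
class ModelResult:
    name: str
    raw_tps: float   # tokens/sec from a standard single-turn benchmark
    etps: float      # effective tokens/sec measured over multi-turn sessions
    ttft_s: float    # time to first token, in seconds

def effectiveness_index(result: ModelResult) -> int:
    """Effectiveness Index = (eTPS / Raw TPS) * 100, rounded; higher is better."""
    return round(result.etps / result.raw_tps * 100)

# Early test data from the leaderboard table above.
results = [
    ModelResult("Llama 3.1 70B", 45.2, 38.7, 1.4),
    ModelResult("Qwen2.5-32B", 68.4, 52.1, 0.8),
    ModelResult("Gemma 2 27B", 71.3, 44.6, 0.6),
]

# Sort by Effectiveness Index, best first, and print a simple leaderboard.
for r in sorted(results, key=effectiveness_index, reverse=True):
    print(f"{r.name:<15} raw={r.raw_tps:>5.1f}  eTPS={r.etps:>5.1f}  "
          f"TTFT={r.ttft_s:.1f}s  EI={effectiveness_index(r)}")
```

Running this reproduces the indexes in the table (86, 76, 63). Note the ranking flips relative to raw speed: Gemma 2 27B has the highest Raw TPS but the lowest Effectiveness Index, which is exactly the gap the metric is meant to expose.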
Originally posted by u/axendo on r/ArtificialInteligence
