Benchmark: 58.4 vs 56.7 (beats GPT-5.4)
License: fully open (Apache 2.0)
What it actually does: runs 8-hour fully autonomous agent loops and builds complete apps by itself, end-to-end
Cost: basically just your internet bandwidth

These open-source Chinese models keep coming, so here's the real question for everyone still paying OpenAI or Anthropic by the token for coding work: how are you going to justify that spend tomorrow? Or is self-hosting a 200B model still too much hardware hassle for small teams?

For those who don't know SWE-bench: it's basically a software engineering benchmark for agentic coding, where models resolve real GitHub issues autonomously.
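To make the "justify that spend" question concrete, here's a rough break-even sketch comparing per-token API pricing against an amortized self-hosted box. Every number in it (token volume, price per million tokens, hardware cost, power draw) is a made-up placeholder, not a real quote — plug in your own figures:

```python
# Back-of-envelope: per-token API spend vs. amortized self-hosting cost.
# All numbers below are hypothetical placeholders, not real pricing.

def api_monthly_cost(tokens_per_month: float, usd_per_million_tokens: float) -> float:
    """API spend scales linearly with token volume."""
    return tokens_per_month / 1_000_000 * usd_per_million_tokens

def selfhost_monthly_cost(hardware_usd: float, amortize_months: int,
                          power_watts: float, usd_per_kwh: float) -> float:
    """Self-hosting is roughly flat: hardware amortization plus electricity."""
    amortized = hardware_usd / amortize_months
    electricity = power_watts / 1000 * 24 * 30 * usd_per_kwh  # kWh per month
    return amortized + electricity

# Hypothetical scenario: 500M tokens/month at $10 per 1M tokens,
# vs a $30k GPU server amortized over 36 months drawing 2 kW at $0.15/kWh.
api = api_monthly_cost(500_000_000, 10.0)
diy = selfhost_monthly_cost(30_000, 36, 2000, 0.15)
print(f"API: ${api:,.0f}/mo vs self-host: ${diy:,.0f}/mo")
```

The crossover point depends entirely on your token volume: at low usage the flat hardware cost dominates and the API wins; at high sustained usage the linear per-token bill overtakes it.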
Originally posted by u/pretendingMadhav on r/ArtificialInteligence
