SWE-rebench leaderboard update: GLM-5.2, Qwen3.6-27B, Qwen3.6-35B-A3B, Gemma 4 31B and more + improved UI
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Hi all,

We made several updates to the SWE-rebench leaderboard: we added new models, refreshed recent results, and reworked the leaderboard UI to make results easier to read, compare, and understand.

New models include GLM-5.2, Qwen3.6-27B, Qwen3.6-35B-A3B, Gemma 4 31B, and more.

For r/LocalLLaMA, the most interesting part is probably the local / self-hosted model results. Qwen3.6-27B is quite strong for its size, while Qwen3.6-35B-A3B and Gemma 4 31B are now on the board for comparison.

Which local models should we test? Let us know which ones you use for coding agents or local development, and we'll consider adding them in future updates.

Links:
Leaderboard: https://swe-rebench.com/
Our Discord: https://discord.gg/V8FqXQ4CgU
X post with the update: https://x.com/ibragim_bad/status/2072318238407483593?s=20
Harbor (if you want to run the agent on your own): https://hub.harborframework.com/datasets/swe-rebench/swe-rebench-leaderboard/latest