r/LocalLLaMA · June 1, 2026 · 1 min read

Mellum 2 12B A2.5B


Coding-focused small MoE from JetBrains. They claim coding performance around Qwen 3.5 9B for the reasoning model. Worse than Qwen 3.5 4B in everything else.

Models: https://huggingface.co/collections/JetBrains/mellum-2

Technical report: https://arxiv.org/abs/2605.31268

submitted by /u/Middle_Bullfrog_6173