MTP-only GGUF subsets: Qwen3.5/3.6
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
These are just MTP-only GGUF subsets of the Qwen3.5/3.6 Medium/Large models (27B and above), meant to accelerate token generation for Qwen-based models that ship without MTP tensors. I hope they help with experimenting on various Qwen3.5/3.6-based fine-tunes. I originally created some of these MTP-only subsets to accelerate token generation of trohrbaugh/Qwen3.5-122B-A10B-heretic (self-converted version), but the main reason I published them is Ornith-1.0-35B.
Hope they help someone. Edit (2026-07-01): added an MTP-only GGUF subset of Qwen3.5-9B, since there are many fine-tunes based on this model. There is no plan for 4B or smaller.