Qwen3.6 huge quality gain from Q4 to Q6 for coding agent
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
So, last week I tried to update my unused local LLM setup. I had to stop using it because quality was too low and deepseek was too cheap.
First thing I stopped using Ollama and now I only use llama.cpp built in server that works really great.
The quality improvement from Q4 to Q6 is outstanding and finally a local LLM server can work very similarly to paid APIs.
That's great! And MTP makes a big performance gain, on a dual 3090 (downvolted and limited to 65°C) it generates from 20 to 50 tokens per second with minimal heat generation.
So yes, that time has finally arrived! Local coding agents are a thing and they work 😎
[link] [comments]
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.