r/LocalLLaMA · 1 min read

Devs - you have 64gb of VRAM - which model do you use for coding?

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

I've currently settled on an Unsloth quant of the Qwen 3.5 122b-a10b model (UD-IQ4_NL). With a 100k context window in bf16, I only had to offload a few layers to CPU/RAM, and it runs at around 30 tok/sec, which is fine for me.
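For anyone sizing a similar setup, here is a rough back-of-the-envelope sketch of why a few layers spill out of 64 GB. It assumes about 4.5 bits per weight, the nominal figure for a plain IQ4_NL quant; the UD variant mixes quant types, so the real file size will differ. The KV cache formula assumes standard attention, and the layer and head numbers in the example call are placeholders, not the model's actual config.

```python
def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def kv_cache_gb(n_attn_layers: int, n_kv_heads: int, head_dim: int,
                context_len: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV cache size in GB for standard attention.

    Stores one K and one V tensor per attention layer;
    bytes_per_elem=2 corresponds to a bf16 cache.
    """
    elems = 2 * n_attn_layers * n_kv_heads * head_dim * context_len
    return elems * bytes_per_elem / 1e9


# 122B total parameters at an assumed ~4.5 bits per weight.
model_gb = weights_gb(122e9, 4.5)

# Placeholder architecture values for illustration only (NOT the real config).
cache_gb = kv_cache_gb(n_attn_layers=10, n_kv_heads=2, head_dim=128,
                       context_len=100_000)

print(f"weights ~{model_gb:.1f} GB, KV cache ~{cache_gb:.2f} GB")
```

Under these assumptions the weights alone land slightly above 64 GB before any context is allocated, which is consistent with needing to push a few layers to CPU/RAM.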

I've spent hours testing many models, and I am currently deeply impressed with this one. I also use both Qwen 3.6 models depending on need, but I think this biggun' is about to become my daily driver.

Curious to know what others with similar VRAM capacity use?

submitted by /u/Jorlen