Is there any reason for a lack of love for Gemma 4 26b?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
The answer to most questions on here is Qwen3.6 27b or 35b, and then Gemma 4 31b (though less so, since it doesn’t fit well on a single 3090).
Is there any reason why the Gemma 4 26b MoE isn’t mentioned more?
I plan on using Qwen for my coding agents. But I’ve been building a Jarvis for myself: a big all-in-one RAG, personal assistant, etc. on my single-3090 build (with a few side GPUs to support smaller models).
I had Qwen3.6 35b as the primary driver behind this. But the more testing I do, the more I think Gemma may be better for this type of workload. My only red flag is that I don’t see many people talking about it on here anymore.
Why is there a lack of attention around Gemma 4 26b? What skeletons does it have in its closet?
Note: I'm not talking about coding. I'm talking about things like RAG, personal assistant tasks, knowledge base queries, etc. I'll stick with Qwen3.6 for coding.