
v0.31.1: mlx: tighten up gemma4 moe loading code (#16964)

Mirrored from Ollama releases for archival readability. Support the source by reading on the original site.

This change allows the .experts.gate_proj / .up_proj / .down_proj tensor names to
be used for both quantized (i.e. nvfp4 and mxfp8) and non-quantized (bf16) models.
Previously, only non-quantized models used that tensor naming scheme.
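
As a rough illustration of what a shared naming scheme makes possible, the sketch below groups tensor names by expert projection without branching on whether the model is quantized. It is not Ollama's loading code: the function names, the full tensor paths, and the `.weight` / `.scales` suffixes are hypothetical and chosen only for the example.

```python
# Hypothetical sketch: not taken from the Ollama source.
EXPERT_PROJECTIONS = ("gate_proj", "up_proj", "down_proj")


def expert_projection(tensor_name):
    """Return the projection name if the tensor sits under `.experts.<proj>`, else None."""
    parts = tensor_name.split(".")
    for i, part in enumerate(parts[:-1]):
        if part == "experts" and parts[i + 1] in EXPERT_PROJECTIONS:
            return parts[i + 1]
    return None


def group_expert_tensors(tensor_names):
    """Group tensor names by expert projection, ignoring non-expert tensors."""
    groups = {proj: [] for proj in EXPERT_PROJECTIONS}
    for name in tensor_names:
        proj = expert_projection(name)
        if proj is not None:
            groups[proj].append(name)
    return groups


# Assumed example names; real checkpoints may use different prefixes and suffixes.
bf16_names = [
    "model.layers.0.experts.gate_proj.weight",
    "model.layers.0.experts.up_proj.weight",
    "model.layers.0.experts.down_proj.weight",
    "model.layers.0.self_attn.q_proj.weight",
]
quantized_names = [
    "model.layers.0.experts.gate_proj.weight",
    "model.layers.0.experts.gate_proj.scales",
    "model.layers.0.experts.up_proj.weight",
    "model.layers.0.experts.up_proj.scales",
    "model.layers.0.experts.down_proj.weight",
    "model.layers.0.experts.down_proj.scales",
]

bf16_groups = group_expert_tensors(bf16_names)
quantized_groups = group_expert_tensors(quantized_names)
```

Because both lists use the same `.experts.<proj>` prefix, one lookup handles both cases. Any quantization-specific tensors simply land in the same group as the weight they accompany.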
