Updates on North Mini Code: 4 bit quant + Ollama + OpenRouter
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
| Hey! We heard the feedback on making the model more portable and accessible. So in light of that we have 2 updates to share. First, you can pull a new 4-bit quant straight from Hugging Face, so it’s now small enough to run on a Mac or whatever local hardware you’ve got. It needs about 20 gigs so if you have that you are good to go. Second, North Mini Code is now supported on Ollama, and any other local runtimes built atop llama.cpp, and it’s also available via the OpenRouter API. we know a lot of you wanted more access, so hoping this lets more devs build more cool stuff. The full docs are here. Excited to hear what you guys think :) [link] [comments] |
More from r/LocalLLaMA
-
Palantir CEO rages against closed models
Jul 2
-
A cheap trick for reliable structured output: feed the validation error back into the retry
Jul 2
-
SenseNova-U1-8b-MoT-Infographic-V2 (released yesterday) - An open source SOTA beast for infographic design and image editing.
Jul 2
-
[Benchmark] Kimi K2.7 Code Q3 on Mac Studio M3 Ultra + RTX PRO 6000 over llama.cpp RPC: prefill improves, no changes in token generation/decode
Jul 2
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.