r/LocalLLaMA · June 23, 2026 · 1 min read

MiniMax 2.7 @ 47 tg, 1200 pp

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

MiniMax 2.7 REAP Q4 on 96 GB VRAM and 192 GB DDR5 UDIMM RAM, on an MSI B840 board with a 9900X CPU. 1250 W PSU, and all cards are power limited. Ubuntu Linux.

Agent-class model with excellent instruction following and tool calling. I run it in a round-robin loop with 3 sequencing agents that run on the CPU. These "dreamers" are loaded with canonical context in system prompts of 20-40k tokens. I use MoE models for fast sequencing, all at around 15-20 tg and 300 pp (tokens per second for generation and prompt processing). Each loop takes 4 to 10 minutes to complete. There is also an asynchronous dense 12B model tasked with watching the whole loop and calling out one thing wrong.
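For anyone trying to picture the loop, here is a minimal sketch of one round-robin pass with an asynchronous watcher. It is not my actual code: `call_model`, the participant names, and the way each turn's output feeds the next are all assumptions made for illustration, and the inference call is a stub.

```python
import asyncio

# Hypothetical participants: the main model plus three CPU-hosted sequencing agents.
# Names and system prompts are placeholders, not the real configuration.
PARTICIPANTS = [
    ("main", "agent system prompt"),
    ("dreamer-1", "canonical context 1"),
    ("dreamer-2", "canonical context 2"),
    ("dreamer-3", "canonical context 3"),
]


async def call_model(name: str, system_prompt: str, message: str) -> str:
    """Stub for a request to a local inference server (assumption)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"[{name}] {message}"


async def watcher(queue: asyncio.Queue, findings: list) -> None:
    """Collect every turn, then record a single critique for the whole loop."""
    turns = []
    while True:
        item = await queue.get()
        if item is None:  # sentinel: the loop has finished
            break
        turns.append(item)
    transcript = " | ".join(text for _, text in turns)
    findings.append(await call_model("dense-12b", "Name one thing wrong.", transcript))


async def round_robin(task: str, participants: list):
    """Run each participant once, in order, feeding each output to the next."""
    queue: asyncio.Queue = asyncio.Queue()
    findings: list = []
    watch_task = asyncio.create_task(watcher(queue, findings))

    order = []
    message = task
    for name, system_prompt in participants:
        message = await call_model(name, system_prompt, message)
        order.append(name)
        await queue.put((name, message))  # the watcher sees every turn

    await queue.put(None)  # tell the watcher the loop is done
    await watch_task
    return order, message, findings
```

Calling `asyncio.run(round_robin("some task", PARTICIPANTS))` runs one pass. In this sketch the watcher only produces its critique once the pass has finished; a real setup could just as well let it comment mid-loop.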

submitted by /u/Important_Quote_1180
