r/LocalLLaMA
500 articles archived
r/LocalLLaMA community 10d ago
Agent recommendations
Hi, I have a Strix Halo setup with 128GB that runs a couple of models (GPT-OSS 120b, Qwen3.5-122b, Gemma-4-31b) on llama-swap. GPT and Qwen run quite fast at 40-50T/s, while Gemma is slow at 4-5T/s but seems to have the best quality. I'd like to vibe code a personal web project in…
17 -
r/LocalLLaMA community 10d ago
GLM-5.2 is on DeepSWE
https://deepswe.datacurve.ai/ Side note: why does this sub dislike DeepSWE? I wanted to know more, did some research, and found this post, which has since been retracted by the original author (I highly respect them, as they handled the correction well and admitted bias). Another…
37 -
r/LocalLLaMA community 10d ago
Your Favorite Workflow to Convert PDF with Complex Structure to Markdown?
I've tried markitdown, Docling, and MinerU. Are there better tools I should try? I need to process tables, floating boxes, etc. Thanks! submitted by /u/chibop1
30 -
r/LocalLLaMA community 11d ago
Gemma 4 31B Q6 vs Gemma 4 31B QAT
What should I do? I'm stuck; I've been scrolling Reddit for an hour with no luck. Which will be best overall, mainly for creative writing? What's the KLD? Help, guys. submitted by /u/Weak-Shelter-1698
13 -
r/LocalLLaMA community 11d ago
A100 slow Qwen3.6-27B-FP8
I'm setting up a server for someone who has an A100 80GB. Even though the A100 doesn't natively support FP8, does 43 tps decode sound too low for a single request? For comparison, the exact same vLLM config on my RTX 6000 PRO runs the same single-request test at 130 tps. For 8 concurrent…
11 -
r/LocalLLaMA community 11d ago
Tokenomics
submitted by /u/HOLUPREDICTIONS
34 -
r/LocalLLaMA community 11d ago
Gemma 4 QAT seems to respond significantly better to KV cache quantization
KLD on wikitext with 16k context. My hardware isn't up to testing 31B; if anyone else feels like investigating, it would be interesting. submitted by /u/rima_2711
16 -