Tag: Version Bump
250 articles archived under #version-bump

vLLM releases dev-tools 17d ago v0.23.1rc0: [Bugfix][CI] Update Dockerfile dependency graph PNG (#45602) Signed-off-by: sfeng33 4florafeng@gmail.com 37

r/LocalLLaMA community 17d ago Xiaomi is now serving MiMo V2.5 at 1000-3000tps using DFlash & Persistent kernel. DFLash model is out, open-source release promised coming soon https://mimo.xiaomi.com/blog/mimo-tilert-1000tps submitted by /u/Dany0 20

r/LocalLLaMA community 18d ago llama-launcher v1.3 release -> Bayesian Optimisation Hello everyone, some of you may have seen a post of mine from a few days ago about my app, llama-launcher, a lightweight point-and-click GUI to create llama-server commands without the constant need for typing them up. Well, I've just added an optimisation feature that uses… 16

Hacker News — AI on Front Page community 19d ago WASI 0.3 https://github.com/WebAssembly/WASI/releases/tag/v0.3.0 Comments URL: https://news.ycombinator.com/item?id=48504063 Points: 213 # Comments: 83 22

r/LocalLLaMA community 20d ago Best LLM for smut stories I'm trying to find the best LLM for writing erotica/smut, but there doesn't seem to be that many good models right now. I'm using Cydonia 24B v4.3, which gives great results, but I was wondering if there were even better models that could fit into 16GB VRAM with quantization.… 34

Ollama releases dev-tools 20d ago v0.30.8-rc0 launch: Fix launch provider drift (#16683) 34

Ollama releases dev-tools 20d ago v0.30.8 launch: Fix launch provider drift (#16683) 33

vLLM releases dev-tools 20d ago v0.23.0rc2: [Docker] Fix CUTLASS DSL cu13 install order in Dockerfile (#45204) Signed-off-by: Mohammad Miadh Angkad 176301910+mmangkad@users.noreply.github.com (cherry picked from commit 40e065e) 36

vLLM releases dev-tools 20d ago v0.23.0: [Docker] Fix CUTLASS DSL cu13 install order in Dockerfile (#45204) Signed-off-by: Mohammad Miadh Angkad 176301910+mmangkad@users.noreply.github.com (cherry picked from commit 40e065e) 23

Simon Willison community 21d ago datasette-agent 0.2a0 Release: datasette-agent 0.2a0 Highlights from the release notes: Tools can now ask the user questions mid-execution. Tools that declare a context parameter receive a ToolContext object, and await context.ask_user(...) can ask a yes/no, multiple-choice (options=[...]) or… 14

r/MachineLearning community 21d ago Pyrecall open source tool for detecting catastrophic forgetting during LLM fine-tuning [P] Surprised there's no real tooling for this given how much research exists on continual learning. Built pyrecall to fill the gap. Snapshots skill scores before/after fine-tuning, flags regressions, rolls back LoRA adapters by name. Fully local, no external APIs. v0.1.0, MIT, pip… 17

OpenAI Python SDK releases dev-tools 21d ago v2.41.1 2.41.1 (2026-06-05) Full Changelog: v2.41.0...v2.41.1 Build System Remove scheduled release workflow trigger (#3366) (2a91011) 25

r/MachineLearning community 22d ago RelayOps - Production-shaped telecom support agent (54% auto-resolve, 0 unsafe actions, full audit + decision console) [P] I just open-sourced RelayOps - a small, honest, production-shaped AI support agent built specifically for telecom and subscription billing queues. Key results (v1.5.1): 54% of a 50-ticket sample queue auto-resolved 0 unsafe auto-actions 0 billing escapes (tested on 12… 25

Anthropic SDK (Python) releases dev-tools 22d ago v0.109.1 0.109.1 (2026-06-09) Full Changelog: v0.109.0...v0.109.1 Bug Fixes api: add frontier_llm refusal category (d3a806b) 35

Hacker News — AI on Front Page community 22d ago Upcoming breaking changes for npm v12 Article URL: https://github.blog/changelog/2026-06-09-upcoming-breaking-changes-for-npm-v12/ Comments URL: https://news.ycombinator.com/item?id=48467705 Points: 217 # Comments: 68 16

Anthropic SDK (Python) releases dev-tools 22d ago v0.109.0 0.109.0 (2026-06-09) Full Changelog: v0.108.0...v0.109.0 Features api: add support for Managed Agents deployments and environment variable credentials (47633bf) 12

Anthropic SDK (Python) releases dev-tools 22d ago v0.108.0 0.108.0 (2026-06-09) Full Changelog: v0.107.1...v0.108.0 Features api: add support for claude-mythos-5 and claude-fable-5, with support for server-side fallbacks on refusal (6b76649) client: adds client-side fallbacks middleware for API providers that do not support… 12

r/LocalLLaMA community 23d ago Still a VERY lightweight open web-search tool for smaller local LLMs - now with SearXNG support Hey everyone, TinySearch v0.2.0 (first stable beta) is out. The first version used DuckDuckGo directly, which worked well enough to prove the idea, but yeah.. relying on one search source was way too fragile lol. DDG started throwing limits/CAPTCHAs more often in the last 2… 25

Hacker News — AI on Front Page community 23d ago Let's Encrypt bans certificate usage in any US sanctioned territory [pdf] Article URL: https://letsencrypt.org/documents/LE-SA-v1.7-June-04-2026-diff.pdf Comments URL: https://news.ycombinator.com/item?id=48453275 Points: 223 # Comments: 172 21

r/LocalLLaMA community 23d ago Xiaomi just claimed 1,000+ tps on a 1T model using a standard 8-GPU server Just saw Xiaomi MiMo announce MiMo-V2.5-Pro UltraSpeed, claiming they broke the 1,000 tokens/sec output barrier on a 1 trillion parameter MoE model. According to them, they’re doing it on a single standard 8-GPU node, not custom wafer-scale hardware like Cerebras and not… 34

Hacker News — AI on Front Page community 23d ago MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second Article URL: https://mimo.xiaomi.com/blog/mimo-tilert-1000tps Comments URL: https://news.ycombinator.com/item?id=48446639 Points: 252 # Comments: 175 30

Ollama releases dev-tools 24d ago v0.30.7 docs: update docs examples to use Gemma 4 instead of Gemma 3 (#16607) 7

Anthropic SDK (Python) releases dev-tools 24d ago v0.107.1 0.107.1 (2026-06-07) Full Changelog: v0.107.0...v0.107.1 Bug Fixes foundry: send x-api-key header for API-key auth (#62) (1338141), closes #1661 31

Anthropic SDK (Python) releases dev-tools 25d ago v0.107.0 0.107.0 (2026-06-06) Full Changelog: v0.106.0...v0.107.0 Features api: small updates to Managed Agents types (72923f9) 35

Ollama releases dev-tools 26d ago v0.30.7-rc1 openai: align models list with tags (#16556) 13

Ollama releases dev-tools 26d ago v0.30.7-rc0 launch: use native Windows Hermes config path (#16558) 5

Anthropic SDK (Python) releases dev-tools 26d ago v0.106.0 0.106.0 (2026-06-05) Full Changelog: v0.105.2...v0.106.0 Features api: mark Claude Opus 4.1 as deprecated (85068cc) Bug Fixes client: make Foundry client copy() and with_options() work (94146ac) transform schema: preserve $defs when schema root is a $ref (#1642) (fc58e06… 19

Ollama releases dev-tools 27d ago v0.30.6-rc0 launch: oh-my-pi (#16410) 34

Ollama releases dev-tools 27d ago v0.30.6 launch: oh-my-pi (#16410) 21

r/LocalLLaMA community 27d ago BeeLlama v0.3.1 – latest llama.cpp with extras! DFlash, MTP, q6_0 cache, TurboQuant. Single RTX 3090: Qwen 3.6 27B & Gemma 4 31B up to 177.8 tps (4.93x over baseline) BeeLlama v0.3.0 and v0.3.1 are here! Big architectural update to align the fork with upstream llama.cpp and integrate all its additions like MTP and Gemma 4 12B support, while also updating DFlash to handle complex configurations like multi-slot and multi-GPU. Now also… 5

Ollama releases dev-tools 27d ago v0.30.5: launch: hermes-desktop app (#16516) Add support to launch the hermes-desktop app alongside the hermes agent from ollama launch. It will go through the install on first run if hermes-desktop is not already installed. 9

ComfyUI releases dev-tools 27d ago v0.24.1 ComfyUI v0.24.1 8

Ollama releases dev-tools 27d ago v0.30.5-rc0: llama.cpp version update (#16511) Bump llama.cpp to b9509, which includes the upstream Gemma 4 12B multimodal projector fixes for the n_head=0 divide-by-zero crash seen on x86/CUDA/Linux/Windows. Fixes #16479 Fixes #16489 Fixes #16491 Fixes #16492 Fixes #16495 11

r/LocalLLaMA community 28d ago The first Gemma 4 12B finetunes are ready Now you can start building your Gemma 4 12B collection :) https://huggingface.co/igorls/gemma-4-12B-it-heretic-GGUF https://huggingface.co/ReadyArt/Melody1437-12B-v0.4-GGUF https://huggingface.co/DuoNeural/Gemma4-12B-IT-Abliterated-GGUF… 26

vLLM releases dev-tools 28d ago v0.22.1rc2: fix: resolve CUTLASS fmin compatibility for DeepSeek-V4 init Signed-off-by: khluu khluu000@gmail.com 9

vLLM releases dev-tools 28d ago v0.22.1: fix: resolve CUTLASS fmin compatibility for DeepSeek-V4 init Signed-off-by: khluu khluu000@gmail.com 28

OpenAI Python SDK releases dev-tools 28d ago v2.41.0 2.41.0 (2026-06-03) Full Changelog: v2.40.0...v2.41.0 Features api: responses.moderation and chat_completions.moderation (87e46c2) 33

Ollama releases dev-tools 28d ago v0.30.4-rc1: llama-server: fix gemma4 patch wiring (#16477) This will fix the "clip.cpp:4399: Unknown projector type" crash. 4

Ollama releases dev-tools 28d ago v0.30.4: llama-server: fix gemma4 patch wiring (#16477) This will fix the "clip.cpp:4399: Unknown projector type" crash. 38

r/LocalLLaMA community 28d ago Big Model Value Wars - DeepSeek V4 Pro vs MiMo-V2.5-Pro vs MiniMax M3 For those who sometimes boost their local model use with openrouter options, or the madlads who have the infrastructure to actually run those locally, it feels like those three model have the edge in best bang for your buck. How then do you decide which one to use? Do you have a… 19

Hacker News — AI on Front Page community 28d ago Elixir v1.20: Now a gradually typed language Article URL: https://elixir-lang.org/blog/2026/06/03/elixir-v1-20-0-released/ Comments URL: https://news.ycombinator.com/item?id=48388324 Points: 252 # Comments: 71 34

Ollama releases dev-tools 28d ago v0.30.4-rc0: Kill llama-server during Windows cleanup (#16458) Windows installer and app cleanup could leave llama-server.exe running when ollama.exe was killed directly, so cleanup now includes llama-server.exe and taskkill /T. 28

ComfyUI releases dev-tools 28d ago v0.24.0 ComfyUI v0.24.0 32

Ollama releases dev-tools 28d ago v0.30.3 models: add support for gemma4-12b (#16457) 30

r/LocalLLaMA community 28d ago How does the new abliteration tool Apostate compare with others? - Abliterlitics Why Qwen 2.5 7B? Apostate is a new abliteration tool by heterodoxin. He asked me to benchmark it. Qwen 2.5 7B was recommended by heterodoxin as it's the most tested model for Apostate. I abliterated the model with Heretic v1.3.0 and Apostate. The models are available on… 33

Hugging Face Daily Papers research 29d ago PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing with Under-Optimized Region Refinement and Progressive Post-Training Abstract PaddleOCR-VL-1.6 enhances document parsing performance through targeted data optimization and progressive post-training techniques, achieving state-of-the-art results on OmniDocBench v1.6. Generated by Qwen/Qwen2.5-Coder-32B-Instruct We introduce PaddleOCR-VL-1.6, an… 9

vLLM releases dev-tools 29d ago v0.22.1rc1: [docker] Stop using extra-index-url for flashinfer-jit-cache (#44366) Signed-off-by: Kevin H. Luu khluu000@gmail.com 34

Ollama releases dev-tools 29d ago v0.30.2-rc0: fix laguna patch build breakage (#16445) Follow up to #16396 Fix kernel template instantiation so the symbols are exported in the library. 29

Ollama releases dev-tools 29d ago v0.30.2: fix laguna patch build breakage (#16445) Follow up to #16396 Fix kernel template instantiation so the symbols are exported in the library. 38

Ollama releases dev-tools 29d ago v0.30.1: llm: ignore llama-server SSE ping comments (#16443) llama.cpp b9478 added a default 30s SSE ping that emits colon-only comment frames (":\n\n") while streamed requests are idle; Ollama treated non-data SSE lines as JSON, so skip SSE comments in completion and chat streams. 36