Tag

Version Bump

250 articles archived under #version-bump

vLLM releases dev-tools 17d ago

v0.23.1rc0: [Bugfix][CI] Update Dockerfile dependency graph PNG (#45602)

Signed-off-by: sfeng33 <4florafeng@gmail.com>

37
r/LocalLLaMA community 17d ago

Xiaomi is now serving MiMo V2.5 at 1000-3000tps using DFlash & Persistent kernel. DFlash model is out, open-source release promised soon

https://mimo.xiaomi.com/blog/mimo-tilert-1000tps (submitted by /u/Dany0)

20
r/LocalLLaMA community 18d ago

llama-launcher v1.3 release -> Bayesian Optimisation

Hello everyone, some of you may have seen a post of mine from a few days ago about my app, llama-launcher, a lightweight point-and-click GUI to create llama-server commands without the constant need for typing them up. Well, I've just added an optimisation feature that uses…

16
Hacker News — AI on Front Page community 19d ago

WASI 0.3

https://github.com/WebAssembly/WASI/releases/tag/v0.3.0 Comments URL: https://news.ycombinator.com/item?id=48504063 Points: 213 # Comments: 83

22
r/LocalLLaMA community 20d ago

Best LLM for smut stories

I'm trying to find the best LLM for writing erotica/smut, but there doesn't seem to be that many good models right now. I'm using Cydonia 24B v4.3, which gives great results, but I was wondering if there were even better models that could fit into 16GB VRAM with quantization.…

34
Ollama releases dev-tools 20d ago

v0.30.8-rc0

launch: Fix launch provider drift (#16683)

34
Ollama releases dev-tools 20d ago

v0.30.8

launch: Fix launch provider drift (#16683)

33
vLLM releases dev-tools 20d ago

v0.23.0rc2: [Docker] Fix CUTLASS DSL cu13 install order in Dockerfile (#45204)

Signed-off-by: Mohammad Miadh Angkad <176301910+mmangkad@users.noreply.github.com> (cherry picked from commit 40e065e)

36
vLLM releases dev-tools 20d ago

v0.23.0: [Docker] Fix CUTLASS DSL cu13 install order in Dockerfile (#45204)

Signed-off-by: Mohammad Miadh Angkad <176301910+mmangkad@users.noreply.github.com> (cherry picked from commit 40e065e)

23
Simon Willison community 21d ago

datasette-agent 0.2a0

Release: datasette-agent 0.2a0. Highlights from the release notes: Tools can now ask the user questions mid-execution. Tools that declare a context parameter receive a ToolContext object, and await context.ask_user(...) can ask a yes/no, multiple-choice (options=[...]) or…

14
r/MachineLearning community 21d ago

Pyrecall: open-source tool for detecting catastrophic forgetting during LLM fine-tuning [P]

Surprised there's no real tooling for this given how much research exists on continual learning. Built pyrecall to fill the gap. Snapshots skill scores before/after fine-tuning, flags regressions, rolls back LoRA adapters by name. Fully local, no external APIs. v0.1.0, MIT, pip…

17
OpenAI Python SDK releases dev-tools 21d ago

v2.41.1

2.41.1 (2026-06-05). Full Changelog: v2.41.0...v2.41.1. Build System: Remove scheduled release workflow trigger (#3366) (2a91011)

25
r/MachineLearning community 22d ago

RelayOps - Production-shaped telecom support agent (54% auto-resolve, 0 unsafe actions, full audit + decision console) [P]

I just open-sourced RelayOps - a small, honest, production-shaped AI support agent built specifically for telecom and subscription billing queues. Key results (v1.5.1): 54% of a 50-ticket sample queue auto-resolved; 0 unsafe auto-actions; 0 billing escapes (tested on 12…

25
Anthropic SDK (Python) releases dev-tools 22d ago

v0.109.1

0.109.1 (2026-06-09). Full Changelog: v0.109.0...v0.109.1. Bug Fixes: api: add frontier_llm refusal category (d3a806b)

35
Hacker News — AI on Front Page community 22d ago

Upcoming breaking changes for npm v12

Article URL: https://github.blog/changelog/2026-06-09-upcoming-breaking-changes-for-npm-v12/ Comments URL: https://news.ycombinator.com/item?id=48467705 Points: 217 # Comments: 68

16
Anthropic SDK (Python) releases dev-tools 22d ago

v0.109.0

0.109.0 (2026-06-09). Full Changelog: v0.108.0...v0.109.0. Features: api: add support for Managed Agents deployments and environment variable credentials (47633bf)

12
Anthropic SDK (Python) releases dev-tools 22d ago

v0.108.0

0.108.0 (2026-06-09). Full Changelog: v0.107.1...v0.108.0. Features: api: add support for claude-mythos-5 and claude-fable-5, with support for server-side fallbacks on refusal (6b76649); client: adds client-side fallbacks middleware for API providers that do not support…

12
r/LocalLLaMA community 23d ago

Still a VERY lightweight open web-search tool for smaller local LLMs - now with SearXNG support

Hey everyone, TinySearch v0.2.0 (first stable beta) is out. The first version used DuckDuckGo directly, which worked well enough to prove the idea, but yeah.. relying on one search source was way too fragile lol. DDG started throwing limits/CAPTCHAs more often in the last 2…

25
Hacker News — AI on Front Page community 23d ago

Let's Encrypt bans certificate usage in any US sanctioned territory [pdf]

Article URL: https://letsencrypt.org/documents/LE-SA-v1.7-June-04-2026-diff.pdf Comments URL: https://news.ycombinator.com/item?id=48453275 Points: 223 # Comments: 172

21
r/LocalLLaMA community 23d ago

Xiaomi just claimed 1,000+ tps on a 1T model using a standard 8-GPU server

Just saw Xiaomi MiMo announce MiMo-V2.5-Pro UltraSpeed, claiming they broke the 1,000 tokens/sec output barrier on a 1 trillion parameter MoE model. According to them, they’re doing it on a single standard 8-GPU node, not custom wafer-scale hardware like Cerebras and not…

34
Hacker News — AI on Front Page community 23d ago

MiMo-v2.5-Pro-UltraSpeed: 1T model with 1000 tokens per second

Article URL: https://mimo.xiaomi.com/blog/mimo-tilert-1000tps Comments URL: https://news.ycombinator.com/item?id=48446639 Points: 252 # Comments: 175

30
Ollama releases dev-tools 24d ago

v0.30.7

docs: update docs examples to use Gemma 4 instead of Gemma 3 (#16607)

7
Anthropic SDK (Python) releases dev-tools 24d ago

v0.107.1

0.107.1 (2026-06-07). Full Changelog: v0.107.0...v0.107.1. Bug Fixes: foundry: send x-api-key header for API-key auth (#62) (1338141), closes #1661

31
Anthropic SDK (Python) releases dev-tools 25d ago

v0.107.0

0.107.0 (2026-06-06). Full Changelog: v0.106.0...v0.107.0. Features: api: small updates to Managed Agents types (72923f9)

35
Ollama releases dev-tools 26d ago

v0.30.7-rc1

openai: align models list with tags (#16556)

13
Ollama releases dev-tools 26d ago

v0.30.7-rc0

launch: use native Windows Hermes config path (#16558)

5
Anthropic SDK (Python) releases dev-tools 26d ago

v0.106.0

0.106.0 (2026-06-05). Full Changelog: v0.105.2...v0.106.0. Features: api: mark Claude Opus 4.1 as deprecated (85068cc). Bug Fixes: client: make Foundry client copy() and with_options() work (94146ac); transform schema: preserve $defs when schema root is a $ref (#1642) (fc58e06…

19
Ollama releases dev-tools 27d ago

v0.30.6-rc0

launch: oh-my-pi (#16410)

34
Ollama releases dev-tools 27d ago

v0.30.6

launch: oh-my-pi (#16410)

21
r/LocalLLaMA community 27d ago

BeeLlama v0.3.1 – latest llama.cpp with extras! DFlash, MTP, q6_0 cache, TurboQuant. Single RTX 3090: Qwen 3.6 27B & Gemma 4 31B up to 177.8 tps (4.93x over baseline)

BeeLlama v0.3.0 and v0.3.1 are here! Big architectural update to align the fork with upstream llama.cpp and integrate all its additions like MTP and Gemma 4 12B support, while also updating DFlash to handle complex configurations like multi-slot and multi-GPU. Now also…

5
Ollama releases dev-tools 27d ago

v0.30.5: launch: hermes-desktop app (#16516)

Add support to launch the hermes-desktop app alongside the hermes agent from ollama launch. It will go through the install on first run if hermes-desktop is not already installed.

9
ComfyUI releases dev-tools 27d ago

v0.24.1

ComfyUI v0.24.1

8
Ollama releases dev-tools 27d ago

v0.30.5-rc0: llama.cpp version update (#16511)

Bump llama.cpp to b9509, which includes the upstream Gemma 4 12B multimodal projector fixes for the n_head=0 divide-by-zero crash seen on x86/CUDA/Linux/Windows. Fixes #16479, #16489, #16491, #16492, #16495.

11
r/LocalLLaMA community 28d ago

The first Gemma 4 12B finetunes are ready

Now you can start building your Gemma 4 12B collection :) https://huggingface.co/igorls/gemma-4-12B-it-heretic-GGUF https://huggingface.co/ReadyArt/Melody1437-12B-v0.4-GGUF https://huggingface.co/DuoNeural/Gemma4-12B-IT-Abliterated-GGUF…

26
vLLM releases dev-tools 28d ago

v0.22.1rc2: fix: resolve CUTLASS fmin compatibility for DeepSeek-V4 init

Signed-off-by: khluu <khluu000@gmail.com>

9
vLLM releases dev-tools 28d ago

v0.22.1: fix: resolve CUTLASS fmin compatibility for DeepSeek-V4 init

Signed-off-by: khluu <khluu000@gmail.com>

28
OpenAI Python SDK releases dev-tools 28d ago

v2.41.0

2.41.0 (2026-06-03). Full Changelog: v2.40.0...v2.41.0. Features: api: responses.moderation and chat_completions.moderation (87e46c2)

33
Ollama releases dev-tools 28d ago

v0.30.4-rc1: llama-server: fix gemma4 patch wiring (#16477)

This will fix the "clip.cpp:4399: Unknown projector type" crash.

4
Ollama releases dev-tools 28d ago

v0.30.4: llama-server: fix gemma4 patch wiring (#16477)

This will fix the "clip.cpp:4399: Unknown projector type" crash.

38
r/LocalLLaMA community 28d ago

Big Model Value Wars - DeepSeek V4 Pro vs MiMo-V2.5-Pro vs MiniMax M3

For those who sometimes boost their local model use with OpenRouter options, or the madlads who have the infrastructure to actually run those locally, it feels like those three models have the edge in best bang for your buck. How then do you decide which one to use? Do you have a…

19
Hacker News — AI on Front Page community 28d ago

Elixir v1.20: Now a gradually typed language

Article URL: https://elixir-lang.org/blog/2026/06/03/elixir-v1-20-0-released/ Comments URL: https://news.ycombinator.com/item?id=48388324 Points: 252 # Comments: 71

34
Ollama releases dev-tools 28d ago

v0.30.4-rc0: Kill llama-server during Windows cleanup (#16458)

Windows installer and app cleanup could leave llama-server.exe running when ollama.exe was killed directly, so cleanup now includes llama-server.exe and taskkill /T.

28
ComfyUI releases dev-tools 28d ago

v0.24.0

ComfyUI v0.24.0

32
Ollama releases dev-tools 28d ago

v0.30.3

models: add support for gemma4-12b (#16457)

30
r/LocalLLaMA community 28d ago

How does the new abliteration tool Apostate compare with others? - Abliterlitics

Why Qwen 2.5 7B? Apostate is a new abliteration tool by heterodoxin. He asked me to benchmark it. Qwen 2.5 7B was recommended by heterodoxin as it's the most tested model for Apostate. I abliterated the model with Heretic v1.3.0 and Apostate. The models are available on…

33
Hugging Face Daily Papers research 29d ago

PaddleOCR-VL-1.6: Expanding the Frontier of Document Parsing with Under-Optimized Region Refinement and Progressive Post-Training

Abstract: PaddleOCR-VL-1.6 enhances document parsing performance through targeted data optimization and progressive post-training techniques, achieving state-of-the-art results on OmniDocBench v1.6. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) We introduce PaddleOCR-VL-1.6, an…

9
vLLM releases dev-tools 29d ago

v0.22.1rc1: [docker] Stop using extra-index-url for flashinfer-jit-cache (#44366)

Signed-off-by: Kevin H. Luu <khluu000@gmail.com>

34
Ollama releases dev-tools 29d ago

v0.30.2-rc0: fix laguna patch build breakage (#16445)

Follow-up to #16396. Fix kernel template instantiation so the symbols are exported in the library.

29
Ollama releases dev-tools 29d ago

v0.30.2: fix laguna patch build breakage (#16445)

Follow-up to #16396. Fix kernel template instantiation so the symbols are exported in the library.

38
Ollama releases dev-tools 29d ago

v0.30.1: llm: ignore llama-server SSE ping comments (#16443)

llama.cpp b9478 added a default 30s SSE ping that emits colon-only comment frames (":\n\n") while streamed requests are idle. Ollama treated non-data SSE lines as JSON, so this release skips SSE comments in completion and chat streams.

36
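
For readers unfamiliar with the SSE detail in the v0.30.1 note, here is a minimal Python sketch of the general idea: a stream parser that ignores comment lines (those starting with a colon) instead of trying to decode them as JSON. This is not Ollama's actual code (Ollama is written in Go). The function name `iter_sse_json`, the `content` field, and the `[DONE]` end-of-stream sentinel are illustrative assumptions, not taken from the release notes.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield JSON payloads from SSE lines, skipping comments and blank lines.

    Hypothetical helper for illustration only.
    """
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # blank line: SSE event delimiter
        if line.startswith(":"):
            continue  # SSE comment, e.g. a colon-only keep-alive ping
        if not line.startswith("data:"):
            continue  # other fields (event:, id:, retry:) are ignored in this sketch
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        yield json.loads(payload)


# A simulated stream with a ping comment frame between two data frames.
stream = [
    'data: {"content": "Hel"}\n',
    "\n",
    ":\n",
    "\n",
    'data: {"content": "lo"}\n',
    "\n",
    "data: [DONE]\n",
]
chunks = [event["content"] for event in iter_sse_json(stream)]
print("".join(chunks))  # prints "Hello"
```

Without the `startswith(":")` check, a parser that passes every non-empty line to `json.loads` would raise on the ping frame, which is the class of failure the release note describes.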
