Tag: Version Bump
251 articles archived under #version-bump

Ollama releases · dev-tools · 29d ago · 36
v0.30.1: llm: ignore llama-server SSE ping comments (#16443) llama.cpp b9478 added a default 30s SSE ping that emits colon-only comment frames (":\n\n") while streamed requests are idle; Ollama treated non-data SSE lines as JSON, so skip SSE comments in completion and chat streams.

Ollama releases · dev-tools · 29d ago · 7
v0.30.1-rc0 launch: isolate Codex launch configuration (#16437)

r/MachineLearning · community · 29d ago · 30
Backpropagation destroys V1 brain alignment in one epoch, tracking RSA alignment to fMRI across training for BP, FA, predictive coding, and STDP [R] Third in a series of papers tracking learning rules vs. human fMRI (THINGS dataset, V1–IT, N=3 subjects). Previous finding: untrained CNNs match backprop at V1. This paper asks: when does training break that, and does the learning rule matter? Setup: RSA alignment measured at 8…

OpenAI Python SDK releases · dev-tools · 1mo ago · 19
v2.40.0 2.40.0 (2026-06-01) Full Changelog: v2.39.0...v2.40.0 Features api: Add Amazon Bedrock Responses support Bug Fixes api: allow setting bedrock api keys on the client directly (4d5bfde)

Ollama releases · dev-tools · 1mo ago · 30
v0.30.0: launch: migrate Codex config (#16397) launch: migrate Codex config

OpenAI Python SDK releases · dev-tools · 1mo ago · 10
v2.39.0 2.39.0 (2026-06-01) Full Changelog: v2.38.0...v2.39.0 Features api: workload identity in audit logs, additional_tools item in responses, fix ActionSearch.query to be optional. (ab60d7a)

ComfyUI releases · dev-tools · 1mo ago · 28
v0.23.0 What's Changed feat: MediaPipe face detection (CORE-235) by @kijai in #14009 Multi-threaded load of models from disk (big load time speedups & Offload to disk) (CORE-43,CORE-152,CORE-164,CORE-165,CORE-117) by @rattus128 in #13802 Repo security stuff. by @comfyanonymous in #14019…

Ollama releases · dev-tools · 1mo ago · 19
v0.30.0-rc32: llama-server followups (#16353) llama-server followups Misc fixes for #16031 Add back dropped ROCm build flag for multi-GPU support on windows Fix amdhip64_*.dll version detection for "latest" selection Fix embeddings API for consistent normalize behavior with prior versions ci: set up for automated llama.cpp…

r/LocalLLaMA · community · 1mo ago · 24
mistral.rs v0.8.2: up to 2.8x faster CUDA inference than llama.cpp on GB10, B200, and H100 Hey all! I’ve been working on CUDA performance in mistral.rs, and v0.8.2 is focused on CUDA throughput. The result: on Gemma 4 (dense & MoE), mistral.rs is faster than llama.cpp at every point in my release sweep on GB10/H100/B200. See some results below on GB10 and B200:…

r/LocalLLaMA · community · 1mo ago · 17
Llama Studio v0.2.0 I have made an update to my llama-server WebUI based on some awesome feedback and interaction with the community. 1) JSON model config replaced by per-model shell scripts. Run from CLI, paste from unsloth, email to your buddy or post to reddit: Using real shell scripts to store…

Hacker News - AI on Front Page · community · 1mo ago · 34
The AV2 Video Standard Has Released (Final v1.0 Specification) Article URL: https://av2.aomedia.org Comments URL: https://news.ycombinator.com/item?id=48340910 Points: 203 # Comments: 80

r/LocalLLaMA · community · 1mo ago · 38
this new Moss tts 1.5 is damn good with voice cloning https://huggingface.co/spaces/OpenMOSS-Team/MOSS-TTS-v1.5 I prefer this over fish audio s2 pro because fish audio dont allow commercial use Long Cat DiT 3.5 is also a another good model. submitted by /u/9r4n4y

vLLM releases · dev-tools · 1mo ago · 10
v0.22.1rc0: [CI] Make Model Executor test hangs fail fast with a traceback (#43971) Signed-off-by: khluu khluu000@gmail.com Co-authored-by: Claude noreply@anthropic.com

llama.cpp releases · dev-tools · 1mo ago · 34
b9411 model : support for DeepseekV32ForCausalLM with generic DeepSeek Sparse Attention (DSA) implementation (#23346) llama : support DeepSeek V3.2 model family (with DSA lightning indexer) convert : handle DeepseekV32ForCausalLM architecture ggml : support for f16 GGML_OP_FILL…

Ollama releases · dev-tools · 1mo ago · 29
v0.30.0-rc31 ci fix - non-shallow MLX checkout

Ollama releases · dev-tools · 1mo ago · 18
v0.30.0-rc30 version bump

Anthropic SDK (Python) releases · dev-tools · 1mo ago · 14
v0.105.2 0.105.2 (2026-05-29) Full Changelog: v0.105.1...v0.105.2

Anthropic SDK (Python) releases · dev-tools · 1mo ago · 34
v0.105.1 0.105.1 (2026-05-29) Full Changelog: v0.105.0...v0.105.1 Chores internal: use Trusted Publishing for PyPI releases (1d04fc5)

Ollama releases · dev-tools · 1mo ago · 24
v0.30.0-rc29 review comments

Anthropic SDK (Python) releases · dev-tools · 1mo ago · 12
v0.105.0 0.105.0 (2026-05-28) Full Changelog: v0.104.1...v0.105.0 Features api: Add support for claude-opus-4-8, mid-conversation system blocks, and usage.output_tokens_details (f18b014) support custom file size caps (#1825) (7e5f944) Chores examples: rename managed-agents…

r/LocalLLaMA · community · 1mo ago · 22
Krasis update: Qwen3.6-35B-A3B (Q4) at reading speed, 1x 8GB 3070 Mobile laptop (32GB RAM) Context Krasis is an LLM runtime for running models that don't fit into VRAM. Krasis streams the model through VRAM from system RAM efficiently and handles prefill and decode as separate architectures and optimised usecases. Latest results (v1.0 release) 1x Laptop RTX 3070…

vLLM releases · dev-tools · 1mo ago · 20
v0.22.0rc3: [BugFix] Fix hard-coded timeout for multi-API-server startup (#43768) Signed-off-by: Vadim Gimpelson vadim.gimpelson@gmail.com Co-authored-by: Nick Hill nickhill123@gmail.com

vLLM releases · dev-tools · 1mo ago · 29
v0.22.0: [BugFix] Fix hard-coded timeout for multi-API-server startup (#43768) Signed-off-by: Vadim Gimpelson vadim.gimpelson@gmail.com Co-authored-by: Nick Hill nickhill123@gmail.com

vLLM releases · dev-tools · 1mo ago · 11
v0.22.0rc2: Fix early CUDA init (#43791) Signed-off-by: Harry Mellor 19981378+hmellor@users.noreply.github.com (cherry picked from commit 41688e2)

Ollama releases · dev-tools · 1mo ago · 14
v0.30.0-rc28 add OLLAMA_IGPU_ENABLE and largely disable iGPUs by default

ComfyUI releases · dev-tools · 1mo ago · 36
v0.22.3 ComfyUI v0.22.3

r/MachineLearning · community · 1mo ago · 22
Best Text to Text Translation Model? [D] I'm working on a project that translates any language into English. So far, I've tried NMT models like NLLB, MADLAD, and SeamlessM4T v2. The main issue is that they struggle with proper nouns such as: - names - places - dates - organizations I also tried LLMs like Gemma 4, Qwen…

r/LocalLLaMA · community · 1mo ago · 18
Info: Nvidia Cuda 13.3 landed Cuda 13.3 Downloads Release Notes Anybody already tried llama.cpp with 13.3? submitted by /u/parrot42

vLLM releases · dev-tools · 1mo ago · 18
v0.22.0rc1: [MRV2][BugFix] Fix KV connector handling in spec decode case (#43719) Signed-off-by: Nick Hill nickhill123@gmail.com Co-authored-by: Wentao Ye 44945378+yewentao256@users.noreply.github.com (cherry picked from commit 8c94938)

Ollama releases · dev-tools · 1mo ago · 20
v0.30.0-rc27 ci: windows path workaround for CPU build

Ollama releases · dev-tools · 1mo ago · 33
v0.30.0-rc26: Merge remote-tracking branch 'upstream/main' into llama-runner-phase-0 Conflicts: server/images.go server/images_test.go

r/LocalLLaMA · community · 1mo ago · 10
OpenMOSS-Team/MOSS-TTS-v1.5 · Hugging Face MOSS-TTS-v1.5 MOSS-TTS-v1.5 is continued from MOSS-TTS 1.0. It preserves the main 1.0 capabilities, including zero-shot voice cloning, long-form speech generation, token-level duration control, Pinyin/IPA pronunciation control, multilingual synthesis, and code-switching. For…

r/LocalLLaMA · community · 1mo ago · 26
Harbor v0.4.19 - vllm/sglang/llama.cpp launch codex/claude/pi/opencode I'm usually not posting about Harbor releases out of the respect for the community here, but I think v0.4.19 might save a lot of people some time. Harbor can now launch your local agentic coding tools with local inference backends. For example, to run pi + vllm: # model…

Ollama releases · dev-tools · 1mo ago · 13
v0.30.0-rc25 ci: fix WoA cross-compile

r/LocalLLaMA · community · 1mo ago · 7
MiMo-V2.5-coder Hi, I've just released MiMo-V2.5-coder. If you have 128 Gb, this is an excellent alternative to Qwen3.6 and DS4, especially for coding. Fast, and with reliable tool calling. Give it a try! submitted by /u/jedisct1

Ollama releases · dev-tools · 1mo ago · 20
v0.30.0-rc24 version bump

r/MachineLearning · community · 1mo ago · 14
LQS v3.1 - an open methodology for rating AI training data (multi-oracle consensus + signed certificates) [P] Solo author here. I spent the last six months building (and then sunsetting) a marketplace for AI training data. The marketplace failed for an interesting reason: the actual bottleneck isn't supply. There's tons of data. The bottleneck is that buyers can't independently evaluate…

r/LocalLLaMA · community · 1mo ago · 28
BeeLlama v0.2.0 – major DFlash update. Single RTX 3090: Qwen 3.6 27B up to 164 tps (4.40x), Gemma 4 31B up to 177.8 tps (4.93x). Prompt processing speed near baseline. BeeLlama v0.2.0 is here! Not quite a pegasus, but close enough. GitHub | Qwen 3.6 27B Quick Start | Gemma 4 31B Quick Start Full Gemma 4 31B support with efficient DFlash implementation and vision. Major Qwen 3.6 27B performance update from lower DFlash overhead, cleaner prefill…

ComfyUI releases · dev-tools · 1mo ago · 6
v0.22.2 ComfyUI v0.22.2

r/LocalLLaMA · community · 1mo ago · 5
trained a prompt injection detector using ml-intern and DeepSeek v4 Flash, runs in the browser Trained a prompt injection classifier using ml-intern + DeepSeek v4 Flash. DistilBERT, F1 99%, ONNX int8, ~65 MB, runs in browser with Transformers.js v3. You can try it here: https://huggingface.co/spaces/av-codes/prompt-injection-detector --- I've been interested in prompt…

Ollama releases · dev-tools · 1mo ago · 8
v0.30.0-rc23 lint fix

Anthropic SDK (Python) releases · dev-tools · 1mo ago · 29
v0.104.1 0.104.1 (2026-05-21) Full Changelog: v0.104.0...v0.104.1 Bug Fixes streaming: carry encrypted_content through beta compaction accumulator (#1821) (f7a720c)

Hacker News - AI on Front Page · community · 1mo ago · 27
Deno 2.8 Article URL: https://deno.com/blog/v2.8 Comments URL: https://news.ycombinator.com/item?id=48234380 Points: 215 # Comments: 98

ComfyUI releases · dev-tools · 1mo ago · 18
v0.22.1 ComfyUI v0.22.1

OpenAI Python SDK releases · dev-tools · 1mo ago · 26
v2.38.0 2.38.0 (2026-05-21) Full Changelog: v2.37.0...v2.38.0 Features api: api update (33d1d01) api: manual updates (a21700a) api: update OpenAPI spec or Stainless config (00265c5) Chores api: docs updates (ee10152) check release PR custom code sync (2638779) remove release…

Anthropic SDK (Python) releases · dev-tools · 1mo ago · 7
v0.104.0 0.104.0 (2026-05-21) Full Changelog: v0.103.1...v0.104.0 Features api: Add support for thinking-token-count beta for estimated tokens in thinking block deltas when streaming (80d0fdf)

Ollama releases · dev-tools · 1mo ago · 5
v0.30.0-rc22 version bump

r/LocalLLaMA · community · 1mo ago · 33
LlamaStation v0.9 - llama.cpp GUI for Windows with multi-backend support, TurboQuant, MTP and more I've been building this for the past few months as a side project - started because I didn't want to run llama.cpp from the command line every time I wanted to try a model. I just wanted something that worked with a click. Fair warning: I'm not a developer. This is 100% vibe…

ComfyUI releases · dev-tools · 1mo ago · 30
v0.22.0 ComfyUI v0.22.0

llama.cpp releases · dev-tools · 1mo ago · 37
b9246: snapdragon: update toolchain to v0.6 (#23369) snapdragon: update compiler flags to enable all CPU features snapdragon: update readme to point to toolchain v0.6 snapdragon: bump toolchain docker to v0.6

Page 3 of 6 · 251 articles