Model releases
500 articles archived under #model-release

TechCrunch — AI · news-outlet · 5d ago · 29
Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on
New models are launching in Asia that promise Mythos-like capabilities without fear of an export ban. U.S. AI labs may never recover this enormous market.

r/LocalLLaMA · community · 5d ago · 15
We built a calibration-aware Q4_K_M quant of Qwen3.5 0.8B that recovers 96.5% of the BF16 gap vs pure llama.cpp Q4_K_M (SpectralQuant)
Hey everyone, We just released our first release candidate from Spectral Labs: a Qwen3.5 0.8B Q4_K_M built using a new calibration-aware quantization approach we're calling SpectralQuant. The goal here was to see if we could make a standard Q4_K_M footprint behave more like a…

Ahead of AI (Sebastian Raschka) · research · 5d ago · 17
Using Local Coding Agents
Using Open-Weight Models in Local Coding Harnesses as an Alternative to Claude Code and Codex Subscriptions

r/LocalLLaMA · community · 5d ago · 19
Orthrus (diffusion head) trained Qwen 3.5/3.6 and Gemma 4 models are dropping soon
"Hi all, we are finalized with our testing and are preparing the release pipeline. We will be releasing support for the Qwen3.5, Qwen3.6, and Gemma4 very soon. Alongside the model checkpoints, we will be open-sourcing our complete end-to-end training and evaluation code. Stay…

r/LocalLLaMA · community · 5d ago · 19
New deepseek vision model incoming?
Hello guys, it seems like DeepSeek added a new vision mode to their application. Does this mean, that they will release a new vision model? Edit: Guys.it is not an OCR model. I have just asked it to describe multiple images, which had no text in them.
submitted by …

Hacker News — AI on Front Page · community · 5d ago · 19
DeepSeek open-sources inference optimizations with 60–85% faster generation [pdf]
Article URL: https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf
Comments URL: https://news.ycombinator.com/item?id=48696585
Points: 219 · # Comments: 43

llama.cpp releases · dev-tools · 5d ago · 11
b9823
ci : add windows-openvino to check-release (#25022) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64…

r/LocalLLaMA · community · 5d ago · 30
Are there any qwen finetunes that were genuinely stronger than the base?
It's pretty popular to finetune qwen models but I never hear anyone say anything positive about them.
submitted by /u/MrMrsPotts

r/LocalLLaMA · community · 5d ago · 18
deepseek-ai/DeepSeek-V4-Pro-DSpark • Huggingface
https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro-DSpark
https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf
submitted by /u/External_Mood4719

Latent.Space · news-outlet · 5d ago · 30
[AINews] OpenAI GPT-5.6 Sol / Terra / Luna — restricted to trusted partners
Oddly tiered releases to both OAI and ANT on the same day.

r/LocalLLaMA · community · 5d ago · 4
When can we expect merged DeepSeek V4 Flash / MiniMax M3 llama.cpp support?
I am relatively new here, I have little experience in how long support development takes. I know there are forks. But not merged status means AFAIK that support is far from perfect. When can we expect stable full support for DeepSeek V4 Flash and/or MiniMax M3 in llama.cpp?…

TechCrunch — AI · news-outlet · 5d ago · 12
Trump Admin releases Anthropic Mythos to be used by more than 100 US companies, agencies
Over 100 companies and government agencies are reportedly authorized to use Mythos 5, including their non-American employees.

Hugging Face Daily Papers · research · 5d ago · 16
ABACUS: Adapting Unified Foundation Model for Bridging Image Count Understanding and Generation
Abstract: ABACUS is a unified vision-language model that performs object counting and related tasks through innovative spatial grounding, boundary-aware counting policies, and self-critical learning strategies.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct
ABACUS is a unified…

vLLM releases · dev-tools · 5d ago · 23
v0.24.0
[CI] Raise gsm8k startup timeout for MoE Refactor Qwen3 NVFP4 configs…

Hacker News — AI on Front Page · community · 5d ago · 33
US allows Anthropic to release Mythos to 'trusted partners'
Article URL: https://www.reuters.com/technology/us-releases-anthropic-model-mythos-some-us-companies-semafor-reports-2026-06-26/
Comments URL: https://news.ycombinator.com/item?id=48692995
Points: 207 · # Comments: 166

Hacker News — AI on Front Page · community · 5d ago · 21
U.S. allows Anthropic to release Mythos AI to ‘trusted’ US organizations
https://archive.md/ArXuF
https://www.nbcnews.com/tech/tech-news/us-government-gives-a...
Comments URL: https://news.ycombinator.com/item?id=48692995
Points: 336 · # Comments: 333

Simon Willison · community · 5d ago · 34
Quoting Dean W. Ball
This is a bad state of affairs. Consider, in particular, some industry dynamics: Frontier models are trained at an enormous cost, and a significant fraction of that cost is recouped in the few post-release months that they are broadly available. After that period elapses, the…

LangChain releases · dev-tools · 5d ago · 22
langchain-anthropic==1.4.8
Changes since langchain-anthropic==1.4.7 release(anthropic): 1.4.8 (#38490) fix(anthropic): keep initial text on content_block_start (#38442) chore: bump langgraph-checkpoint from 4.1.0 to 4.1.1 in /libs/partners/anthropic (#38479) fix(core): add messages to bare raise…

r/LocalLLaMA · community · 5d ago · 8
Can Qwen3.6-35B-A3B on an RTX 3060 Replace Google Vision for Receipt-to-JSON Extraction?
I tried replacing Google Vision in my receipt pipeline with a local Qwen model. I had an old LINE message bot where I could send a receipt photo, it would go to Google Vision, get parsed into JSON, and saved in SQLite. Recently I tried again, but locally. Setup: RTX 3060 12GB…

Hugging Face Daily Papers · research · 5d ago · 6
Neglected Free Lunch from Post-training: Progress Advantage for LLM Agents
Abstract: Reinforcement learning post-training enables effective step-level scoring for language models without requiring dedicated reward model training by deriving an implicit advantage function called progress advantage.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct
Process…

Hugging Face Daily Papers · research · 5d ago · 22
Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation
Abstract: A unified agentic framework called Qwen-Image-Agent is proposed to address the context gap in text-to-image generation by progressively constructing complete generation context through planning, reasoning, searching, and memory mechanisms.
Generated by…

Ollama releases · dev-tools · 5d ago · 28
v0.30.11
What's Changed launch: add thinking capability detection to opencode by @hoyyeva in #15434 launch: auto-install Claude Code by @hoyyeva in #16802 launch: auto-install opencode when missing by @hoyyeva in #16806 discover: fix inverted iGPU/dGPU Vulkan classification on Windows…

TechCrunch — AI · news-outlet · 5d ago · 7
OpenAI limits GPT-5.6 rollout after government request, says restrictions shouldn’t be the norm
“We don’t believe this kind of government access process should become the long-term default,” says OpenAI. “It keeps the best tools from users, developers, enterprises, cyber defenders, and global partners who need them.”

Hacker News — AI on Front Page · community · 5d ago · 11
U.S. government will decide who gets to use GPT-5.6
https://archive.ph/PCQQl
Comments URL: https://news.ycombinator.com/item?id=48690101
Points: 444 · # Comments: 618

r/LocalLLaMA · community · 5d ago · 22
Streaming medical STT running locally on a MacBook
Quick teaser of what I’ve been working on over the last few weeks: a streaming medical speech-to-text model that runs fully on-device. This demo is running locally on a MacBook through MLX. Still doing more evals, but planning to release the open weights next week. …

llama.cpp releases · dev-tools · 5d ago · 23
b9817
openvino: Update to OV 2026.2.1, self-contained release packages, operator improvements (#24974) Update to OV 2026.2.1, Make OV release packages self-contained Update to OV 2026.2.1, Make OV release packages self-contained OpenVINO Backend: Remove compute_op_type hardcoded…

Hacker News — AI on Front Page · community · 5d ago · 29
Previewing GPT‑5.6 Sol: a next-generation model
Article URL: https://openai.com/index/previewing-gpt-5-6-sol/
Comments URL: https://news.ycombinator.com/item?id=48689028
Points: 222 · # Comments: 199

Hugging Face Daily Papers · research · 5d ago · 10
Information-Aware KV Cache Compression for Long Reasoning
Abstract: InfoKV is an entropy-aware KV cache compression framework that enhances long-context reasoning in LLMs by incorporating information-theoretic signals alongside attention weights.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct
Reasoning capability has advanced rapidly in…

r/LocalLLaMA · community · 6d ago · 31
Gemma 4 12b needs glasses
Having a lot of fun using Gemma 4 as an assistant, but is growing frustrated with the poor default image resolution setting for image vision. Tasks like identifying smaller text in an image that Qwen 3.6 flies through, Gemma 4 are never able to decipher. Even larger overall…

Don't Worry About the Vase · community · 6d ago · 6
White House Will Ad Hoc Decide Who Can Individually Access GPT-5.6
We have a new standard policy for releasing frontier AI models. It is not good.

Hugging Face Daily Papers · research · 6d ago · 10
EO-WM: A Physically Informed World Model for Probabilistic Earth Observation Forecasting
Abstract: EO-WM is a video diffusion transformer for multispectral Earth Observation forecasting that incorporates physically informed conditioning frameworks to better capture weather-driven uncertainties in land-surface dynamics.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct…

Hugging Face Daily Papers · research · 6d ago · 36
LISA: Likelihood Score Alignment for Visual-condition Controllable Generation
Abstract: Score-based generative modeling reveals that side networks contribute likelihood scores to conditional control, leading to improved training efficiency through likelihood score alignment regularization.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct
The prevalent…

r/LocalLLaMA · community · 6d ago · 38
Combined RTX5080 & 4060 for inference?
Hey, I currently use my RTX 4060 8G for inference with Qwen 3.6-35B-A3B Q8 (q8 for everything weight,value,key) max 60k context per agent (for quality over speed, with CPU &DDR4 offloading) but : I only get ~100pp & 20tg at max when context is still low on Qwen 3.6-35B-A3B Q8,…

Hugging Face Daily Papers · research · 6d ago · 11
When Does Combining Language Models Help? A Co-Failure Ceiling on Routing, Voting, and Mixture-of-Agents Across 67 Frontier Models
Abstract: Multi-model systems face fundamental accuracy limits determined by the rate at which all models fail simultaneously, regardless of their individual correlations or ensemble strategies.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct
Multi-model LLM systems such as routing,…

OpenAI · official-blog · 6d ago · 10
Previewing GPT-5.6 Sol: a next-generation model
OpenAI previews GPT-5.6 Sol, a next-generation model with stronger capabilities in coding, science, and cybersecurity, paired with its most advanced safety stack.

r/LocalLLaMA · community · 6d ago · 13
Help optimizing llama.cpp + Qwen 27B on RTX PRO 6000 Blackwell for coding agents
Our company recently acquired a workstation with an RTX PRO 6000 Blackwell, and we're experimenting with local LLMs to reduce part of our Claude token usage. Right now we’re running Qwen3.6 27B MTP Q8_K_XL with llama.cpp on Windows 11. I've been using both Claude Opus and…

Hugging Face Daily Papers · research · 6d ago · 4
CoffeeBench: Benchmarking Long-Horizon LLM Agents in Heterogeneous Multi-Agent Economies
Abstract: CoffeeBench evaluates LLM agents in a multi-agent economic simulation where firms interact over 90 days to maximize profits, revealing differences in communication patterns and performance among various models.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct
As LLM agents…

LangChain releases · dev-tools · 6d ago · 24
langchain-fireworks==1.4.3
Changes since langchain-fireworks==1.4.2 release(fireworks): 1.4.3 chore: bump vcrpy from 8.1.1 to 8.2.1 in /libs/partners/fireworks (#38314) chore: bump langsmith from 0.8.16 to 0.8.18 in /libs/partners/fireworks (#38313) chore: bump langsmith from 0.8.14 to 0.8.16 in…

r/LocalLLaMA · community · 6d ago · 8
Anyone tried Ornith-1.0 9B?
Should I even give it a chance over "qwopus3.5 9b v3.5" or "qwopus3.5 9b coder"? anyone tried it??
submitted by /u/BothYou243

Smol AI News · news-outlet · 6d ago · 35
not much happened today
OpenAI previewed GPT-5.6 with three variants: Sol (flagship), Terra (mid-tier), and Luna (lower-cost), launching under a restricted rollout mandated by the U.S. government, limiting access to trusted partners. Sol boasts enhanced cybersecurity and safety…

r/LocalLLaMA · community · 6d ago · 25
Does llama cpp split mode tensor cause issues?
I split qwen 27b and Gemma 4 26b (moe) across a 5080, and 2x 5060ti. I noticed setting split mode to tensor mode will cause looping issues in OpenCode with tool calls or just through the reasoning traces. Anyone else get this or understand why? Split mode layer seems to work…

Hugging Face Daily Papers · research · 6d ago · 17
JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting
Abstract: JetSpec is a speculative decoding framework that combines efficient forward drafting with causal conditioning to improve LLM inference speed and acceptance rates across various benchmarks.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct
Speculative decoding (SD)…

Hugging Face Daily Papers · research · 6d ago · 25
Hallucination in World Models is Predictable and Preventable
Abstract: World models exhibit hallucinations in low-data regions of state-action space, which can be detected and mitigated using data-centric signals and coverage-aware sampling techniques.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct
Modern generative world models render…

Hugging Face Daily Papers · research · 6d ago · 26
The Verification Horizon: No Silver Bullet for Coding Agent Rewards
Abstract: Verification challenges in AI agents arise from the difficulty of aligning proxy signals with human intent, requiring adaptive verification systems that evolve alongside generative capabilities.
Generated by Qwen/Qwen2.5-Coder-32B-Instruct
A classical intuition holds…

arXiv — NLP / Computation & Language · research · 6d ago · 10
HyperDFlash: MHC-Aligned Block Speculative Decoding with Gated Residual Reduction
arXiv:2606.26744v1 · Announce Type: cross
Abstract: We present HyperDFlash, a block-parallel speculative decoding framework tailored to the novel multi-hyper-connection (MHC) architecture proposed by DeepSeek-V4. Despite the strong initial-token drafting performance of the native…

arXiv — NLP / Computation & Language · research · 6d ago · 38
Thinking Like a Scientist? A Structural Study of LLM-Generated Research Methods
arXiv:2606.26130v1 · Announce Type: new
Abstract: Large Language Models (LLMs) are increasingly used to guide research methodology, yet their default methodological tendencies under minimal prompting remain unclear. Here, we prompt GPT-5.1, Gemini 3 Pro, and DeepSeek-V3.2 with an…

arXiv — NLP / Computation & Language · research · 6d ago · 12
From Structure to Synergy: A Survey of Vision-Language Perception Paradigm Evolution in Multimodal Large Language Models
arXiv:2606.26196v1 · Announce Type: new
Abstract: Multimodal Large Language Models (MLLMs) have recently made remarkable progress in unifying vision-language understanding and reasoning, especially following the introduction of models such as OpenAI's O-series and DeepSeek's…

arXiv — NLP / Computation & Language · research · 6d ago · 29
Where Do Models Find Happiness? Emotion Vectors in Open-Source LLMs
arXiv:2606.26987v1 · Announce Type: new
Abstract: Recent work identified emotion vectors in Claude Sonnet 4.5, which are internal representations that encode emotion concepts, causally influence behavior, and exhibit geometry mirroring human psychological structure. We test the…

arXiv — NLP / Computation & Language · research · 6d ago · 10
AgentX: Towards Agent-Driven Self-Iteration of Industrial Recommender Systems
arXiv:2606.26859v1 · Announce Type: cross
Abstract: Recommendation algorithm iteration is moving from an artisanal, engineer-bound process toward an industrialized research loop, but this transition remains blocked by a structural execution bottleneck: the idea-to-launch cycle…

arXiv — NLP / Computation & Language · research · 6d ago · 16
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models
arXiv:2506.15681v4 · Announce Type: replace
Abstract: Recent advancements in vision-language models (VLMs) have leveraged large language models (LLMs) to achieve performance on par with closed-source systems like GPT-4V. However, deploying these models in real-world scenarios,…

Page 5 of 10 · 500 articles