Tag

Model releases

500 articles archived under #model-release

TechCrunch — AI news-outlet 5d ago

Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on

New models are launching in Asia that promise Mythos-like capabilities without fear of an export ban. U.S. AI labs may never recover this enormous market.

29
r/LocalLLaMA community 5d ago

We built a calibration-aware Q4_K_M quant of Qwen3.5 0.8B that recovers 96.5% of the BF16 gap vs pure llama.cpp Q4_K_M (SpectralQuant)

Hey everyone, we just released our first release candidate from Spectral Labs: a Qwen3.5 0.8B Q4_K_M built using a new calibration-aware quantization approach we're calling SpectralQuant. The goal here was to see if we could make a standard Q4_K_M footprint behave more like a…

15
Ahead of AI (Sebastian Raschka) research 5d ago

Using Local Coding Agents

Using Open-Weight Models in Local Coding Harnesses as an Alternative to Claude Code and Codex Subscriptions

17
r/LocalLLaMA community 5d ago

Orthrus (diffusion head) trained Qwen 3.5/3.6 and Gemma 4 models are dropping soon

Hi all, we have finalized our testing and are preparing the release pipeline. We will be releasing support for Qwen3.5, Qwen3.6, and Gemma4 very soon. Alongside the model checkpoints, we will be open-sourcing our complete end-to-end training and evaluation code. Stay…

19
r/LocalLLaMA community 5d ago

New deepseek vision model incoming?

Hello guys, it seems like DeepSeek added a new vision mode to their application. Does this mean that they will release a new vision model? Edit: Guys, it is not an OCR model. I have just asked it to describe multiple images, which had no text in them.

19
Hacker News — AI on Front Page community 5d ago

DeepSeek open-sources inference optimizations with 60–85% faster generation [pdf]

Article URL: https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf Comments URL: https://news.ycombinator.com/item?id=48696585 Points: 219 # Comments: 43

19
llama.cpp releases dev-tools 5d ago

b9823

ci : add windows-openvino to check-release (#25022). macOS/iOS: macOS Apple Silicon (arm64); macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED; macOS Intel (x64); iOS XCFramework. Linux: Ubuntu x64 (CPU); Ubuntu arm64 (CPU); Ubuntu s390x (CPU); Ubuntu x64 (Vulkan); Ubuntu arm64…

11
r/LocalLLaMA community 5d ago

Are there any qwen finetunes that were genuinely stronger than the base?

It's pretty popular to finetune qwen models but I never hear anyone say anything positive about them. Submitted by /u/MrMrsPotts

30
r/LocalLLaMA community 5d ago

deepseek-ai/DeepSeek-V4-Pro-DSpark • Huggingface

https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro-DSpark https://github.com/deepseek-ai/DeepSpec/blob/main/DSpark_paper.pdf Submitted by /u/External_Mood4719

18
Latent.Space news-outlet 5d ago

[AINews] OpenAI GPT-5.6 Sol / Terra / Luna — restricted to trusted partners

Oddly tiered releases to both OAI and ANT on the same day.

30
r/LocalLLaMA community 5d ago

When can we expect merged DeepSeek V4 Flash / MiniMax M3 llama.cpp support?

I am relatively new here and have little sense of how long support development takes. I know there are forks, but AFAIK unmerged status means that support is far from perfect. When can we expect stable full support for DeepSeek V4 Flash and/or MiniMax M3 in llama.cpp?…

4
TechCrunch — AI news-outlet 5d ago

Trump Admin releases Anthropic Mythos to be used by more than 100 US companies, agencies

Over 100 companies and government agencies are reportedly authorized to use Mythos 5, including their non-American employees.

12
Hugging Face Daily Papers research 5d ago

ABACUS: Adapting Unified Foundation Model for Bridging Image Count Understanding and Generation

Abstract: ABACUS is a unified vision-language model that performs object counting and related tasks through innovative spatial grounding, boundary-aware counting policies, and self-critical learning strategies. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) ABACUS is a unified…

16
vLLM releases dev-tools 5d ago

v0.24.0

[CI] Raise gsm8k startup timeout for MoE Refactor Qwen3 NVFP4 configs…

23
Hacker News — AI on Front Page community 5d ago

US allows Anthropic to release Mythos to 'trusted partners'

Article URL: https://www.reuters.com/technology/us-releases-anthropic-model-mythos-some-us-companies-semafor-reports-2026-06-26/ Comments URL: https://news.ycombinator.com/item?id=48692995 Points: 207 # Comments: 166

33
Hacker News — AI on Front Page community 5d ago

U.S. allows Anthropic to release Mythos AI to ‘trusted’ US organizations

https://archive.md/ArXuF https://www.nbcnews.com/tech/tech-news/us-government-gives-a... Comments URL: https://news.ycombinator.com/item?id=48692995 Points: 336 # Comments: 333

21
Simon Willison community 5d ago

Quoting Dean W. Ball

This is a bad state of affairs. Consider, in particular, some industry dynamics: Frontier models are trained at an enormous cost, and a significant fraction of that cost is recouped in the few post-release months that they are broadly available. After that period elapses, the…

34
LangChain releases dev-tools 5d ago

langchain-anthropic==1.4.8

Changes since langchain-anthropic==1.4.7: release(anthropic): 1.4.8 (#38490); fix(anthropic): keep initial text on content_block_start (#38442); chore: bump langgraph-checkpoint from 4.1.0 to 4.1.1 in /libs/partners/anthropic (#38479); fix(core): add messages to bare raise…

22
r/LocalLLaMA community 5d ago

Can Qwen3.6-35B-A3B on an RTX 3060 Replace Google Vision for Receipt-to-JSON Extraction?

I tried replacing Google Vision in my receipt pipeline with a local Qwen model. I had an old LINE message bot where I could send a receipt photo, it would go to Google Vision, get parsed into JSON, and saved in SQLite. Recently I tried again, but locally. Setup: RTX 3060 12GB…

8
Hugging Face Daily Papers research 5d ago

Neglected Free Lunch from Post-training: Progress Advantage for LLM Agents

Abstract: Reinforcement learning post-training enables effective step-level scoring for language models without requiring dedicated reward model training by deriving an implicit advantage function called progress advantage. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Process…

6
Hugging Face Daily Papers research 5d ago

Qwen-Image-Agent: Bridging the Context Gap in Real-World Image Generation

Abstract: A unified agentic framework called Qwen-Image-Agent is proposed to address the context gap in text-to-image generation by progressively constructing complete generation context through planning, reasoning, searching, and memory mechanisms. Generated by…

22
Ollama releases dev-tools 5d ago

v0.30.11

What's Changed: launch: add thinking capability detection to opencode by @hoyyeva in #15434; launch: auto-install Claude Code by @hoyyeva in #16802; launch: auto-install opencode when missing by @hoyyeva in #16806; discover: fix inverted iGPU/dGPU Vulkan classification on Windows…

28
TechCrunch — AI news-outlet 5d ago

OpenAI limits GPT-5.6 rollout after government request, says restrictions shouldn’t be the norm

“We don’t believe this kind of government access process should become the long-term default,” says OpenAI. “It keeps the best tools from users, developers, enterprises, cyber defenders, and global partners who need them.”

7
Hacker News — AI on Front Page community 5d ago

U.S. government will decide who gets to use GPT-5.6

https://archive.ph/PCQQl Comments URL: https://news.ycombinator.com/item?id=48690101 Points: 444 # Comments: 618

11
r/LocalLLaMA community 5d ago

Streaming medical STT running locally on a MacBook

Quick teaser of what I’ve been working on over the last few weeks: a streaming medical speech-to-text model that runs fully on-device. This demo is running locally on a MacBook through MLX. Still doing more evals, but planning to release the open weights next week.  …

22
llama.cpp releases dev-tools 5d ago

b9817

openvino: Update to OV 2026.2.1, self-contained release packages, operator improvements (#24974). Update to OV 2026.2.1, Make OV release packages self-contained. OpenVINO Backend: Remove compute_op_type hardcoded…

23
Hacker News — AI on Front Page community 5d ago

Previewing GPT‑5.6 Sol: a next-generation model

Article URL: https://openai.com/index/previewing-gpt-5-6-sol/ Comments URL: https://news.ycombinator.com/item?id=48689028 Points: 222 # Comments: 199

29
Hugging Face Daily Papers research 5d ago

Information-Aware KV Cache Compression for Long Reasoning

Abstract: InfoKV is an entropy-aware KV cache compression framework that enhances long-context reasoning in LLMs by incorporating information-theoretic signals alongside attention weights. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Reasoning capability has advanced rapidly in…

10
r/LocalLLaMA community 6d ago

Gemma 4 12b needs glasses

Having a lot of fun using Gemma 4 as an assistant, but I am growing frustrated with the poor default image resolution setting for vision. Tasks like identifying smaller text in an image, which Qwen 3.6 flies through, Gemma 4 is never able to decipher. Even larger overall…

31
Don't Worry About the Vase community 6d ago

White House Will Ad Hoc Decide Who Can Individually Access GPT-5.6

We have a new standard policy for releasing frontier AI models. It is not good.

6
Hugging Face Daily Papers research 6d ago

EO-WM: A Physically Informed World Model for Probabilistic Earth Observation Forecasting

Abstract: EO-WM is a video diffusion transformer for multispectral Earth Observation forecasting that incorporates physically informed conditioning frameworks to better capture weather-driven uncertainties in land-surface dynamics. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.)…

10
Hugging Face Daily Papers research 6d ago

LISA: Likelihood Score Alignment for Visual-condition Controllable Generation

Abstract: Score-based generative modeling reveals that side networks contribute likelihood scores to conditional control, leading to improved training efficiency through likelihood score alignment regularization. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) The prevalent…

36
r/LocalLLaMA community 6d ago

Combined RTX5080 & 4060 for inference?

Hey, I currently use my RTX 4060 8G for inference with Qwen 3.6-35B-A3B Q8 (q8 for everything: weight, value, key), max 60k context per agent (for quality over speed, with CPU & DDR4 offloading), but I only get ~100pp & 20tg at max when context is still low on Qwen 3.6-35B-A3B Q8,…

38
Hugging Face Daily Papers research 6d ago

When Does Combining Language Models Help? A Co-Failure Ceiling on Routing, Voting, and Mixture-of-Agents Across 67 Frontier Models

Abstract: Multi-model systems face fundamental accuracy limits determined by the rate at which all models fail simultaneously, regardless of their individual correlations or ensemble strategies. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Multi-model LLM systems such as routing,…

11
OpenAI official-blog 6d ago

Previewing GPT-5.6 Sol: a next-generation model

OpenAI previews GPT-5.6 Sol, a next-generation model with stronger capabilities in coding, science, and cybersecurity, paired with its most advanced safety stack.

10
r/LocalLLaMA community 6d ago

Help optimizing llama.cpp + Qwen 27B on RTX PRO 6000 Blackwell for coding agents

Our company recently acquired a workstation with an RTX PRO 6000 Blackwell, and we're experimenting with local LLMs to reduce part of our Claude token usage. Right now we’re running Qwen3.6 27B MTP Q8_K_XL with llama.cpp on Windows 11. I've been using both Claude Opus and…

13
Hugging Face Daily Papers research 6d ago

CoffeeBench: Benchmarking Long-Horizon LLM Agents in Heterogeneous Multi-Agent Economies

Abstract: CoffeeBench evaluates LLM agents in a multi-agent economic simulation where firms interact over 90 days to maximize profits, revealing differences in communication patterns and performance among various models. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) As LLM agents…

4
LangChain releases dev-tools 6d ago

langchain-fireworks==1.4.3

Changes since langchain-fireworks==1.4.2: release(fireworks): 1.4.3; chore: bump vcrpy from 8.1.1 to 8.2.1 in /libs/partners/fireworks (#38314); chore: bump langsmith from 0.8.16 to 0.8.18 in /libs/partners/fireworks (#38313); chore: bump langsmith from 0.8.14 to 0.8.16 in…

24
r/LocalLLaMA community 6d ago

Anyone tried Ornith-1.0 9B?

Should I even give it a chance over "qwopus3.5 9b v3.5" or "qwopus3.5 9b coder"? Anyone tried it? Submitted by /u/BothYou243

8
Smol AI News news-outlet 6d ago

not much happened today

OpenAI previewed GPT-5.6 with three variants: Sol (flagship), Terra (mid-tier), and Luna (lower-cost), launching under a restricted rollout mandated by the U.S. government, limiting access to trusted partners. Sol boasts enhanced cybersecurity and safety…

35
r/LocalLLaMA community 6d ago

Does llama cpp split mode tensor cause issues?

I split qwen 27b and Gemma 4 26b (moe) across a 5080, and 2x 5060ti. I noticed setting split mode to tensor mode will cause looping issues in OpenCode with tool calls or just through the reasoning traces. Anyone else get this or understand why? Split mode layer seems to work…

25
Hugging Face Daily Papers research 6d ago

JetSpec: Breaking the Scaling Ceiling of Speculative Decoding with Parallel Tree Drafting

Abstract: JetSpec is a speculative decoding framework that combines efficient forward drafting with causal conditioning to improve LLM inference speed and acceptance rates across various benchmarks. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Speculative decoding (SD)…

17
Hugging Face Daily Papers research 6d ago

Hallucination in World Models is Predictable and Preventable

Abstract: World models exhibit hallucinations in low-data regions of state-action space, which can be detected and mitigated using data-centric signals and coverage-aware sampling techniques. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) Modern generative world models render…

25
Hugging Face Daily Papers research 6d ago

The Verification Horizon: No Silver Bullet for Coding Agent Rewards

Abstract: Verification challenges in AI agents arise from the difficulty of aligning proxy signals with human intent, requiring adaptive verification systems that evolve alongside generative capabilities. (Generated by Qwen/Qwen2.5-Coder-32B-Instruct.) A classical intuition holds…

26
arXiv — NLP / Computation & Language research 6d ago

HyperDFlash: MHC-Aligned Block Speculative Decoding with Gated Residual Reduction

arXiv:2606.26744v1 Announce Type: cross Abstract: We present HyperDFlash, a block-parallel speculative decoding framework tailored to the novel multi-hyper-connection (MHC) architecture proposed by DeepSeek-V4. Despite the strong initial-token drafting performance of the native…

10
arXiv — NLP / Computation & Language research 6d ago

Thinking Like a Scientist? A Structural Study of LLM-Generated Research Methods

arXiv:2606.26130v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly used to guide research methodology, yet their default methodological tendencies under minimal prompting remain unclear. Here, we prompt GPT-5.1, Gemini 3 Pro, and DeepSeek-V3.2 with an…

38
arXiv — NLP / Computation & Language research 6d ago

From Structure to Synergy: A Survey of Vision-Language Perception Paradigm Evolution in Multimodal Large Language Models

arXiv:2606.26196v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have recently made remarkable progress in unifying vision-language understanding and reasoning, especially following the introduction of models such as OpenAI's O-series and DeepSeek's…

12
arXiv — NLP / Computation & Language research 6d ago

Where Do Models Find Happiness? Emotion Vectors in Open-Source LLMs

arXiv:2606.26987v1 Announce Type: new Abstract: Recent work identified emotion vectors in Claude Sonnet 4.5, which are internal representations that encode emotion concepts, causally influence behavior, and exhibit geometry mirroring human psychological structure. We test the…

29
arXiv — NLP / Computation & Language research 6d ago

AgentX: Towards Agent-Driven Self-Iteration of Industrial Recommender Systems

arXiv:2606.26859v1 Announce Type: cross Abstract: Recommendation algorithm iteration is moving from an artisanal, engineer-bound process toward an industrialized research loop, but this transition remains blocked by a structural execution bottleneck: the idea-to-launch cycle…

10
arXiv — NLP / Computation & Language research 6d ago

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

arXiv:2506.15681v4 Announce Type: replace Abstract: Recent advancements in vision-language models (VLMs) have leveraged large language models (LLMs) to achieve performance on par with closed-source systems like GPT-4V. However, deploying these models in real-world scenarios,…

16
