Tag

Model releases

500 articles archived under #model-release · RSS

arXiv — NLP / Computation & Language research 6d ago

From Structure to Synergy: A Survey of Vision-Language Perception Paradigm Evolution in Multimodal Large Language Models

arXiv:2606.26196v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have recently made remarkable progress in unifying vision-language understanding and reasoning, especially following the introduction of models such as OpenAI's O-series and DeepSeek's…

12
arXiv — NLP / Computation & Language research 6d ago

Where Do Models Find Happiness? Emotion Vectors in Open-Source LLMs

arXiv:2606.26987v1 Announce Type: new Abstract: Recent work identified emotion vectors in Claude Sonnet 4.5, which are internal representations that encode emotion concepts, causally influence behavior, and exhibit geometry mirroring human psychological structure. We test the…

29
arXiv — NLP / Computation & Language research 6d ago

AgentX: Towards Agent-Driven Self-Iteration of Industrial Recommender Systems

arXiv:2606.26859v1 Announce Type: cross Abstract: Recommendation algorithm iteration is moving from an artisanal, engineer-bound process toward an industrialized research loop, but this transition remains blocked by a structural execution bottleneck: the idea-to-launch cycle…

10
arXiv — NLP / Computation & Language research 6d ago

GenRecal: Generation after Recalibration from Large to Small Vision-Language Models

arXiv:2506.15681v4 Announce Type: replace Abstract: Recent advancements in vision-language models (VLMs) have leveraged large language models (LLMs) to achieve performance on par with closed-source systems like GPT-4V. However, deploying these models in real-world scenarios,…

16
Hugging Face Daily Papers research 6d ago

ViQ: Text-Aligned Visual Quantized Representations at Any Resolution

Abstract ViQ presents a visual quantization framework that balances semantic richness and detail preservation in discrete representations, enabling efficient multimodal training with native-resolution inputs. Generated by Qwen/Qwen2.5-Coder-32B-Instruct A unified representation…

26
Hugging Face Daily Papers research 6d ago

OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning

Abstract On-policy skill distillation framework extracts dense hindsight supervision from completed trajectories to improve language agent training efficiency and performance. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Outcome-based reinforcement learning provides a stable…

20
r/LocalLLaMA community 6d ago

Stop waiting for Qwen3.7 Openweights.

Ornith-1.0, a family of open-source LLMs specialized for agentic coding. Ornith-1.0 spans the full parameter sizes, including 9B Dense, 35B MoE, and 397B MoE. It achieves state-of-the-art performance among open-source models of comparable size on coding benchmarks. Hugging Face:…

36
ThursdAI news-outlet 6d ago

GLM 5.2 total victory: the week open source won and nobody panicked

From CoreWeave: A chill week, but a total Open Source victory for GLM 5.2 + Sakana Fugu, Krea Open Sources, OpenAI makes inference chips with broadcom, Karpathy gets heat about the new Claude Tag...

35
TechCrunch — AI news-outlet 6d ago

The White House is asking OpenAI to slow roll the release of its new model over safety concerns

OpenAI reportedly plans to share its newest model, GPT 5.6, with a select group of partners instead of to the broader public. The reason: the Trump administration told it to.

14
r/MachineLearning community 6d ago

Documented: Weight-Level Political Conditioning in Large Language Models - A Case Study in AI Bias on the Gaza Genocide Question Conditioning in Large Language Models [R]

This is a post written by Claude Sonnet, after we spent hours going back and forth testing the ideological, structural bias trained into Grok’s weights in recent updates. Judge it by its own merits. ——————————————— I want to be precise about what this post is and isn't. It is…

31
r/LocalLLaMA community 6d ago

audio.cpp: 12 audio models (Qwen3-TTS, PocketTTS, VeVo2 etc) in 1 C++/ggml runtime — TTS up to 5x faster than Python on CUDA

I’ve been working on audio.cpp, a native C++ inference framework for audio models built on top of ggml. The framework currently has 25 model families, but I want to be precise about its state: 12 are released in the repo now and ready for normal use. I’m not counting anything…

24
Hugging Face Daily Papers research 6d ago

Physics Question Scene Graph: Fine-grained Evaluation of Physical Plausibility in Text-to-Video Generation

Abstract A vision-language model-based hierarchical question graph framework evaluates video generation models' adherence to physical laws with granular violation detection and human correlation validation. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Video generation models are…

23
r/LocalLLaMA community 6d ago

Qwen 3.6 27b GLM 5.2 fine-tune?

Hi everyone, Since both models are open weights and GLM seems to find that secret to frontier model reasoning, why don't we see any Qwen GLM finetune yet? Is it because GLM 5.2 is recent and finetune and datasets take time or the community is just not interested in the finetune?…

28
Ars Technica — AI news-outlet 6d ago

Google finally releases a Finance Android app, promises iOS version later in 2026

It took 20 years, but the Finance app arrives just in time to be packed full of AI.

32
r/LocalLLaMA community 6d ago

LFM2.5 230M running in-browser at 1,400 tok/s using custom WebGPU kernels

Everything runs locally in your browser using custom WebGPU kernels written by Fable 5 (before it was shut down) and Opus 4.8. The video was recorded on my M4 Max. Model: LiquidAI/LFM2.5-230M ( GGUF ) Demo: https://huggingface.co/spaces/webml-community/lfm2-webgpu-kernels  …

37
Ars Technica — AI news-outlet 7d ago

Anthropic says Alibaba must be punished for largest Claude cloning attack

Alibaba allegedly used 25,000 accounts to mine Claude over 28.8 million exchanges.

12
TechCrunch — AI news-outlet 7d ago

Anthropic’s Claude is winning over paid consumers, a market owned by ChatGPT

Despite ChatGPT's commanding market lead, consumers who pay for AI have been increasingly choosing Anthropic's Claude, data shows.

21
Simon Willison community 7d ago

datasette-export-database 0.3a2

Release: datasette-export-database 0.3a2 An embarrassingly tiny release. The pyproject.toml had pinned to datasette==1.0a27, inadvertently making this plugin incompatible with all other Datasette versions. It's now datasette>=1.0a27 instead. Tags: datasette

11
Hugging Face Daily Papers research 7d ago

Forecasting Future Behavior as a Learning Task

Abstract Behavior Forecasters are trained to predict large reasoning model outputs from single trajectories, outperforming large language models while requiring significantly less computational cost. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Trust in an AI system is often…

24
r/LocalLLaMA community 7d ago

Which model for technical documentation?

Looking to create high level / low level designs (software), based on existing templates/examples, cross reference code, use mcp to download confluence/jira data - also plug into agentic ‘coding’ frameworks opencode. I mostly use opus 3.6 with Kiro-cli, but I want my data…

32
Hugging Face Daily Papers research 7d ago

Plans Don't Persist: Why Context Management Is Load Bearing for LLM Agents

Abstract Standard LLM agents rely on plan content remaining in context rather than maintaining it as persistent state, with evidence shown through replay pairing diagnostics and compression stress tests. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Long-horizon agents depend on…

27
r/LocalLLaMA community 7d ago

rtx 6000 pro owners, do you regret?

I found the last dealership in my area that has rtx 6000 pro available, i already wanted to buy it 6 months ago when it was around $8k, now prices increased to $13k ish. Regardless the price, are you happy with it? I assume you are using qwen3.6 27b, is it worth it? Please share…

9
Hacker News — AI on Front Page community 7d ago

Show HN: OpenKnowledge – open source AI-first alternative to Obsidian/Notion

Hi HN, Nick here. We’re launching OpenKnowledge ( https://openknowledge.ai/ ), a “what you see is what you get” markdown editor that has direct integrations with Claude, Codex, and other agents. Available as MacOS app or Web UI+CLI. Fully free/local and OSS. We built this…

20
r/LocalLLaMA community 7d ago

Tensor Split Fix for intel GPU's llama.cpp release b9788

sycl : support --split-mode tensor #24152 I'd like to see some numbers if anyone has 2xintel gpus and tries this out

10
Hugging Face Daily Papers research 7d ago

Lite Any Stereo V2: Faster and Stronger Efficient Zero-Shot Stereo Matching

Abstract Lite Any Stereo V2 (LAS2) presents an efficient stereo matching approach that achieves state-of-the-art accuracy with significantly reduced latency through optimized architecture and training strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Recent advances in…

9
r/LocalLLaMA community 7d ago

Ornith-1.0 released on Hugging Face

Including 9B Dense, 31B Dense, 35B MoE, and 397B MoE and reporting sota on different benchmark (let's see if this holds). https://huggingface.co/collections/deepreinforce-ai/ornith-10

26
Hugging Face Daily Papers research 7d ago

PrivacyAlign: Contextual Privacy Alignment for LLM Agents

Abstract Researchers develop a human-centered approach to align AI agents with privacy norms by creating a comprehensive dataset of privacy judgments and using annotation-conditioned reward modeling to improve agent behavior. Generated by Qwen/Qwen2.5-Coder-32B-Instruct AI…

7
Don't Worry About the Vase community 7d ago

AI #174: You're It

Fable remains in limbo, with renewed hope that we will get it back soon (45% by tomorrow, 69% by July 1, nice.) The full capabilities post is now available.

31
Hugging Face Daily Papers research 7d ago

What Intermediate Layers Know: Detecting Jailbreaks from Entropy Dynamics

Abstract Jailbreak attacks expose vulnerabilities in aligned large language models, revealing that harmful intent is encoded in structured intermediate uncertainty dynamics rather than output representations. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Jailbreak attacks reveal…

23
Hugging Face Daily Papers research 7d ago

Distill Once, Adapt Life-Long: Exploring Dataset Distillation for Continual Test-Time Adaptation

Abstract DO-ALL is a test-time adaptation framework that uses dataset distillation to create synthetic anchors for stable long-term model performance without retaining source data. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Continual Test-Time Adaptation (CTTA) aims to…

20
Hugging Face Daily Papers research 7d ago

ReNIO: Reweighting Negative Trajectory Importance for LLM On-Policy Distillation

Abstract ReNIO enhances on-policy distillation for language models by reweighting negative trajectories based on token-level probability ratios, improving reasoning performance in mathematical and code generation tasks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct On-policy…

25
Hugging Face Daily Papers research 7d ago

Autodata: An agentic data scientist to create high quality synthetic data

Abstract Autodata enables AI agents to function as data scientists who create high-quality training data through meta-optimization, demonstrating improved performance across multiple task domains. Generated by Qwen/Qwen2.5-Coder-32B-Instruct We introduce Autodata, a general…

30
r/LocalLLaMA community 7d ago

NVIDIA has released Nemotron-TwoTower-30B-A3B-Base-BF16, an unusual diffusion-based language model built from the Nemotron 3 Nano 30B-A3B backbone.

NVIDIA has released Nemotron-TwoTower-30B-A3B-Base-BF16, an unusual diffusion-based language model built from the Nemotron 3 Nano 30B-A3B backbone. Instead of generating strictly one token at a time, it uses a frozen autoregressive context tower plus a diffusion denoiser tower…

38
Hugging Face Daily Papers research 7d ago

Improved Large Language Diffusion Models

Abstract Masked diffusion language models with fully bidirectional attention outperform autoregressive counterparts on various benchmarks while maintaining competitiveness with established models. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Modern large language models are…

18
Hugging Face Daily Papers research 7d ago

MVTrack4Gen: Multi-View Point Tracking as Geometric Supervision for 4D Video Generation

Abstract A novel-view video synthesis method that enhances motion-aware diffusion models through multi-view point tracking supervision to improve geometric consistency and motion fidelity. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Synthesizing a novel-view video from a…

37
r/LocalLLaMA community 7d ago

Worse quality with MTP - Qwen 3.6, Gemma 4

Hi. I am self-hosting Qwen 3.6 27B Q8_K_XL with Llama.cpp on 4x5070ti. (All 4 cards are on single x16 slot bifurcated to 4x4 with risers). I've been testing it on several work repos with Opencode CLI and in like 8/10 situations the output of non-MTP model is far superior to the…

8
Vercel — AI dev-tools 7d ago

AI SDK 7 is now available

AI SDK 7 is a major release for building production agents in TypeScript. The SDK has grown from model calls and chat primitives into a broader agent platform for developing, running, integrating, and observing agents across text, audio, realtime, image, and video. Every major…

8
Hugging Face Daily Papers research 7d ago

ShutterMuse: Capture-Time Photography Guidance with MLLMs

Abstract Researchers developed a new benchmark and dataset for photography assistance, along with a unified multimodal model that provides both composition guidance and pose recommendations during image capture. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Real-world photography…

12
Hugging Face Daily Papers research 7d ago

RL-Index: Reinforcement Learning for Retrieval Index Reasoning

Abstract RL-Index introduces an agentic indexing framework that shifts reasoning from query time to indexing stage by using LLM-generated rationales and reinforcement learning to improve retrieval effectiveness and reduce latency. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…

25
Hugging Face Daily Papers research 7d ago

CAVEWOMAN: How Large Language Models Behave Under Linguistic Input and Output Compression

Abstract Two-channel evaluation shows output compression reduces costs while input compression increases costs and degrades accuracy across models and datasets. Generated by Qwen/Qwen2.5-Coder-32B-Instruct "Talk short. Drop grammar. Save token." This caveman style is widely…

28
Hugging Face Daily Papers research 7d ago

When Lower Privileges Suffice: Investigating Over-Privileged Tool Selection in LLM Agents

Abstract LLM agents frequently select higher-privilege tools unnecessarily, and while safety alignment doesn't ensure least-privilege choices, a post-training defense can reduce excessive privilege use without sacrificing performance. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…

26
arXiv — Machine Learning research 7d ago

Training Dynamics of Neural Software Defect Predictors under Coupled Data-Quality Issues

arXiv:2606.24968v1 Announce Type: new Abstract: Context: Software defect prediction supports maintenance decisions such as testing prioritization, release-risk assessment, and quality monitoring. However, metric-based SDP datasets often contain coupled data-quality issues,…

6
arXiv — NLP / Computation & Language research 7d ago

Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations

arXiv:2606.25383v1 Announce Type: new Abstract: As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple…

28
arXiv — NLP / Computation & Language research 7d ago

Real-Time Voice AI Hears but Does Not Listen

arXiv:2606.26083v1 Announce Type: new Abstract: Speech conveys information through both words and vocal delivery. We evaluate four leading production realtime voice systems-OpenAI's GPT Realtime 2, Google's Gemini 3.1 Flash Live, and Alibaba's Qwen3.5 Omni Plus and Omni Flash-on…

34
arXiv — NLP / Computation & Language research 7d ago

Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining

arXiv:2606.26050v1 Announce Type: cross Abstract: Midway through an ordinary pretraining run, a small language model learns the pronoun-gender rule: cued with a girl's name ("Sue cried because"), it resolves the next pronoun to she, generalizing to held-out probes (0.94 by step…

4
Hugging Face Daily Papers research 7d ago

TryOnCrafter: Unleashing Camera Trajectories for Realistic Video Virtual Try-on via a Renderable 4D Try-on Proxy

Abstract Camera-controllable video virtual try-on framework uses a 4D proxy with explicit human-environment decoupling and DiT-based video generation for omnidirectional viewing. Generated by Qwen/Qwen2.5-Coder-32B-Instruct While Video Virtual Try-on (VVT) has achieved…

4
r/LocalLLaMA community 7d ago

[NEW MODEL] SupraWeather-Nano-Preview Just released!

SupraWeather Nano is live! ⛈️ We just released SupraWeather-Nano (preview), a small FT-Transformer model purpose-built to classify weather phenomena from raw tabular meteorological features. https://huggingface.co/SupraLabs/SupraWeather-Nano-Demo https://huggingface.co/SupraLabs…

25
Hugging Face Daily Papers research 7d ago

Look Light, Think Heavy: What Multimodal Chain-of-Thought Reasoning Can and Cannot Do

Abstract Multimodal Chain-of-Thought reasoning shows selective effectiveness across different tasks, with limitations in maintaining visual introspection during reasoning processes. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Chain-of-Thought (CoT) has become a standard method…

17
Hugging Face Daily Papers research 7d ago

DomainShuttle: Freeform Open Domain Subject-driven Text-to-video Generation

Abstract DomainShuttle enables open domain subject-driven text-to-video generation with high fidelity and flexibility across in-domain and cross-domain scenarios through domain-aware modeling and dual RoPE schemes. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Open domain…

10
Hugging Face Daily Papers research 7d ago

RoPE-Aware Bit Allocation for KV-Cache Quantization

Abstract Block-GTQ introduces a RoPE-aware bit allocation method for key-cache quantization that improves attention accuracy and downstream performance through adaptive bit distribution and packed cache serving. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Existing low-bit…

22
