News / #model-release Tag Model releases 500 articles archived under #model-release · RSS Sign in to follow arXiv — NLP / Computation & Language research 6d ago From Structure to Synergy: A Survey of Vision-Language Perception Paradigm Evolution in Multimodal Large Language Models arXiv:2606.26196v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have recently made remarkable progress in unifying vision-language understanding and reasoning, especially following the introduction of models such as OpenAI's O-series and DeepSeek's… 12 arXiv — NLP / Computation & Language research 6d ago Where Do Models Find Happiness? Emotion Vectors in Open-Source LLMs arXiv:2606.26987v1 Announce Type: new Abstract: Recent work identified emotion vectors in Claude Sonnet 4.5, which are internal representations that encode emotion concepts, causally influence behavior, and exhibit geometry mirroring human psychological structure. We test the… 29 arXiv — NLP / Computation & Language research 6d ago AgentX: Towards Agent-Driven Self-Iteration of Industrial Recommender Systems arXiv:2606.26859v1 Announce Type: cross Abstract: Recommendation algorithm iteration is moving from an artisanal, engineer-bound process toward an industrialized research loop, but this transition remains blocked by a structural execution bottleneck: the idea-to-launch cycle… 10 arXiv — NLP / Computation & Language research 6d ago GenRecal: Generation after Recalibration from Large to Small Vision-Language Models arXiv:2506.15681v4 Announce Type: replace Abstract: Recent advancements in vision-language models (VLMs) have leveraged large language models (LLMs) to achieve performance on par with closed-source systems like GPT-4V. However, deploying these models in real-world scenarios,… 16 Hugging Face Daily Papers research 6d ago ViQ: Text-Aligned Visual Quantized Representations at Any Resolution Abstract ViQ presents a visual quantization framework that balances semantic richness and detail preservation in discrete representations, enabling efficient multimodal training with native-resolution inputs. Generated by Qwen/Qwen2.5-Coder-32B-Instruct A unified representation… 26 Hugging Face Daily Papers research 6d ago OPID: On-Policy Skill Distillation for Agentic Reinforcement Learning Abstract On-policy skill distillation framework extracts dense hindsight supervision from completed trajectories to improve language agent training efficiency and performance. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Outcome-based reinforcement learning provides a stable… 20 r/LocalLLaMA community 6d ago Stop waiting for Qwen3.7 Openweights. Ornith-1.0, a family of open-source LLMs specialized for agentic coding. Ornith-1.0 spans the full parameter sizes, including 9B Dense, 35B MoE, and 397B MoE. It achieves state-of-the-art performance among open-source models of comparable size on coding benchmarks. Hugging Face:… 36 ThursdAI news-outlet 6d ago GLM 5.2 total victory: the week open source won and nobody panicked From CoreWeave: A chill week, but a total Open Source victory for GLM 5.2 + Sakana Fugu, Krea Open Sources, OpenAI makes inference chips with broadcom, Karpathy gets heat about the new Claude Tag... 35 TechCrunch — AI news-outlet 6d ago The White House is asking OpenAI to slow roll the release of its new model over safety concerns penAI reportedly plans to share its newest model, GPT 5.6, with a select group of partners instead of to the broader public. The reason: the Trump administration told it to. 14 r/MachineLearning community 6d ago Documented: Weight-Level Political Conditioning in Large Language Models - A Case Study in AI Bias on the Gaza Genocide Question Conditioning in Large Language Models [R] This is a post written by Claude Sonnet, after we spent hours going back and forth testing the ideological, structural bias trained into Grok’s weights in recent updates. Judge it by its own merits. ——————————————— I want to be precise about what this post is and isn't. It is… 31 r/LocalLLaMA community 6d ago audio.cpp: 12 audio models (Qwen3-TTS, PocketTTS, VeVo2 etc) in 1 C++/ggml runtime — TTS up to 5x faster than Python on CUDA I’ve been working on audio.cpp , a native C++ inference framework for audio models built on top of ggml. The framework currently has 25 model families, but I want to be precise about its state: 12 are released in the repo now and ready for normal use. I’m not counting anything… 24 Hugging Face Daily Papers research 6d ago Physics Question Scene Graph: Fine-grained Evaluation of Physical Plausibility in Text-to-Video Generation Abstract A vision-language model-based hierarchical question graph framework evaluates video generation models' adherence to physical laws with granular violation detection and human correlation validation. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Video generation models are… 23 r/LocalLLaMA community 6d ago Qwen 3.6 27b GLM 5.2 fine-tune? Hi everyone, Since both models are open weights and GLM seems to find that secret to frontier model reasoning, why don't we see any Qwen GLM finetune yet? Is it because GLM 5.2 is recent and finetune and datasets take time or the community is just not interested in the finetune?… 28 Ars Technica — AI news-outlet 6d ago Google finally releases a Finance Android app, promises iOS version later in 2026 It took 20 years, but the Finance app arrives just in time to be packed full of AI. 32 r/LocalLLaMA community 6d ago LFM2.5 230M running in-browser at 1,400 tok/s using custom WebGPU kernels Everything runs locally in your browser using custom WebGPU kernels written by Fable 5 (before it was shut down) and Opus 4.8. The video was recorded on my M4 Max. Model: LiquidAI/LFM2.5-230M ( GGUF ) Demo: https://huggingface.co/spaces/webml-community/lfm2-webgpu-kernels  … 37 Ars Technica — AI news-outlet 7d ago Anthropic says Alibaba must be punished for largest Claude cloning attack Alibaba allegedly used 25,000 accounts to mine Claude over 28.8 million exchanges. 12 TechCrunch — AI news-outlet 7d ago Anthropic’s Claude is winning over paid consumers, a market owned by ChatGPT Despite ChatGPT's commanding market lead, consumers who pay for AI have been increasingly choosing Anthropic's Claude, data shows. 21 Simon Willison community 7d ago datasette-export-database 0.3a2 Release: datasette-export-database 0.3a2 An embarrassingly tiny release. The pyproject.toml had pinned to datasette==1.0a27 , inadvertently making this plugin incompatible with all other Datasette versions. It's now datasette>=1.0a27 instead. Tags: datasette 11 Hugging Face Daily Papers research 7d ago Forecasting Future Behavior as a Learning Task Abstract Behavior Forecasters are trained to predict large reasoning model outputs from single trajectories, outperforming large language models while requiring significantly less computational cost. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Trust in an AI system is often… 24 r/LocalLLaMA community 7d ago Which model for technical documentation? Looking to create high level / low level designs (software), based on existing templates/examples, cross reference code, use mcp to download confluence/jira data - also plug into agentic ‘coding’ frameworks opencode . I mostly use opus 3.6 with Kiro-cli , but I want my data… 32 Hugging Face Daily Papers research 7d ago Plans Don't Persist: Why Context Management Is Load Bearing for LLM Agents Abstract Standard LLM agents rely on plan content remaining in context rather than maintaining it as persistent state, with evidence shown through replay pairing diagnostics and compression stress tests. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Long-horizon agents depend on… 27 r/LocalLLaMA community 7d ago rtx 6000 pro owners, do you regret? I found the last dealership in my area that has rtx 6000 pro available, i already wanted to buy it 6 months ago when it was around $8k, now prices increased to $13k ish. Regardless the price, are you happy with it? I assume you are using qwen3.6 27b, is it worth it? Please share… 9 Hacker News — AI on Front Page community 7d ago Show HN: OpenKnowledge – open source AI-first alternative to Obsidian/Notion Hi HN, Nick here. We’re launching OpenKnowledge ( https://openknowledge.ai/ ), a “what you see is what you get” markdown editor that has direct integrations with Claude, Codex, and other agents. Available as MacOS app or Web UI+CLI. Fully free/local and OSS. We built this… 20 r/LocalLLaMA community 7d ago Tensor Split Fix for intel GPU's llama.cpp release b9788 sycl : support --split-mode tensor #24152 I'd like to see some numbers if anyone has 2xintel gpus and tries this out   submitted by   /u/Bulky-Priority6824 [link]   [comments] 10 Hugging Face Daily Papers research 7d ago Lite Any Stereo V2: Faster and Stronger Efficient Zero-Shot Stereo Matching Abstract Lite Any Stereo V2 (LAS2) presents an efficient stereo matching approach that achieves state-of-the-art accuracy with significantly reduced latency through optimized architecture and training strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Recent advances in… 9 r/LocalLLaMA community 7d ago Ornith-1.0 released on Hugging Face Including 9B Dense, 31B Dense, 35B MoE, and 397B MoE and reporting sota on different benchmark (let's see if this holds). https://huggingface.co/collections/deepreinforce-ai/ornith-10   submitted by   /u/paf1138 [link]   [comments] 26 Hugging Face Daily Papers research 7d ago PrivacyAlign: Contextual Privacy Alignment for LLM Agents Abstract Researchers develop a human-centered approach to align AI agents with privacy norms by creating a comprehensive dataset of privacy judgments and using annotation-conditioned reward modeling to improve agent behavior. Generated by Qwen/Qwen2.5-Coder-32B-Instruct AI… 7 Don't Worry About the Vase community 7d ago AI #174: You're It Fable remains in limbo, with renewed hope that we will get it back soon (45% by tomorrow, 69% by July 1, nice.) The full capabilities post is now available. 31 Hugging Face Daily Papers research 7d ago What Intermediate Layers Know: Detecting Jailbreaks from Entropy Dynamics Abstract Jailbreak attacks expose vulnerabilities in aligned large language models, revealing that harmful intent is encoded in structured intermediate uncertainty dynamics rather than output representations. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Jailbreak attacks reveal… 23 Hugging Face Daily Papers research 7d ago Distill Once, Adapt Life-Long: Exploring Dataset Distillation for Continual Test-Time Adaptation Abstract DO-ALL is a test-time adaptation framework that uses dataset distillation to create synthetic anchors for stable long-term model performance without retaining source data. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Continual Test-Time Adaptation (CTTA) aims to… 20 Hugging Face Daily Papers research 7d ago ReNIO: Reweighting Negative Trajectory Importance for LLM On-Policy Distillation Abstract ReNIO enhances on-policy distillation for language models by reweighting negative trajectories based on token-level probability ratios, improving reasoning performance in mathematical and code generation tasks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct On-policy… 25 Hugging Face Daily Papers research 7d ago Autodata: An agentic data scientist to create high quality synthetic data Abstract Autodata enables AI agents to function as data scientists who create high-quality training data through meta-optimization, demonstrating improved performance across multiple task domains. Generated by Qwen/Qwen2.5-Coder-32B-Instruct We introduce Autodata, a general… 30 r/LocalLLaMA community 7d ago NVIDIA has released Nemotron-TwoTower-30B-A3B-Base-BF16, an unusual diffusion-based language model built from the Nemotron 3 Nano 30B-A3B backbone. NVIDIA has released Nemotron-TwoTower-30B-A3B-Base-BF16, an unusual diffusion-based language model built from the Nemotron 3 Nano 30B-A3B backbone. Instead of generating strictly one token at a time, it uses a frozen autoregressive context tower plus a diffusion denoiser tower… 38 Hugging Face Daily Papers research 7d ago Improved Large Language Diffusion Models Abstract Masked diffusion language models with fully bidirectional attention outperform autoregressive counterparts on various benchmarks while maintaining competitiveness with established models. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Modern large language models are… 18 Hugging Face Daily Papers research 7d ago MVTrack4Gen: Multi-View Point Tracking as Geometric Supervision for 4D Video Generation Abstract A novel-view video synthesis method that enhances motion-aware diffusion models through multi-view point tracking supervision to improve geometric consistency and motion fidelity. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Synthesizing a novel-view video from a… 37 r/LocalLLaMA community 7d ago Worse quality with MTP - Qwen 3.6, Gemma 4 Hi. I am self-hosting Qwen 3.6 27B Q8_K_XL with Llama.cpp on 4x5070ti. (All 4 cards are on single x16 slot bifurcated to 4x4 with risers). I've been testing it on several work repos with Opencode CLI and in like 8/10 situations the output of non-MTP model is far superior to the… 8 Vercel — AI dev-tools 7d ago AI SDK 7 is now available AI SDK 7 is a major release for building production agents in TypeScript. The SDK has grown from model calls and chat primitives into a broader agent platform for developing, running, integrating, and observing agents across text, audio, realtime, image, and video. Every major… 8 Hugging Face Daily Papers research 7d ago ShutterMuse: Capture-Time Photography Guidance with MLLMs Abstract Researchers developed a new benchmark and dataset for photography assistance, along with a unified multimodal model that provides both composition guidance and pose recommendations during image capture. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Real-world photography… 12 Hugging Face Daily Papers research 7d ago RL-Index: Reinforcement Learning for Retrieval Index Reasoning Abstract RL-Index introduces an agentic indexing framework that shifts reasoning from query time to indexing stage by using LLM-generated rationales and reinforcement learning to improve retrieval effectiveness and reduce latency. Generated by Qwen/Qwen2.5-Coder-32B-Instruct… 25 Hugging Face Daily Papers research 7d ago CAVEWOMAN: How Large Language Models Behave Under Linguistic Input and Output Compression Abstract Two-channel evaluation shows output compression reduces costs while input compression increases costs and degrades accuracy across models and datasets. Generated by Qwen/Qwen2.5-Coder-32B-Instruct "Talk short. Drop grammar. Save token." This caveman style is widely… 28 Hugging Face Daily Papers research 7d ago When Lower Privileges Suffice: Investigating Over-Privileged Tool Selection in LLM Agents Abstract LLM agents frequently select higher-privilege tools unnecessarily, and while safety alignment doesn't ensure least-privilege choices, a post-training defense can reduce excessive privilege use without sacrificing performance. Generated by Qwen/Qwen2.5-Coder-32B-Instruct… 26 arXiv — Machine Learning research 7d ago Training Dynamics of Neural Software Defect Predictors under Coupled Data-Quality Issues arXiv:2606.24968v1 Announce Type: new Abstract: Context: Software defect prediction supports maintenance decisions such as testing prioritization, release-risk assessment, and quality monitoring. However, metric-based SDP datasets often contain coupled data-quality issues,… 6 arXiv — NLP / Computation & Language research 7d ago Introducing corpora Hlava Cor and Hlava AD: Human Label Variation in Coreference and Discourse Relations arXiv:2606.25383v1 Announce Type: new Abstract: As previous research on annotator disagreement in discourse phenomena has shown, understanding text coherence varies considerably from one individual to another. To explore this phenomenon, we created two corpora with multiple… 28 arXiv — NLP / Computation & Language research 7d ago Real-Time Voice AI Hears but Does Not Listen arXiv:2606.26083v1 Announce Type: new Abstract: Speech conveys information through both words and vocal delivery. We evaluate four leading production realtime voice systems-OpenAI's GPT Realtime 2, Google's Gemini 3.1 Flash Live, and Alibaba's Qwen3.5 Omni Plus and Omni Flash-on… 34 arXiv — NLP / Computation & Language research 7d ago Natural Ungrokking: Asymmetric Control of Which Rules Survive Pretraining arXiv:2606.26050v1 Announce Type: cross Abstract: Midway through an ordinary pretraining run, a small language model learns the pronoun-gender rule: cued with a girl's name ("Sue cried because"), it resolves the next pronoun to she, generalizing to held-out probes (0.94 by step… 4 Hugging Face Daily Papers research 7d ago TryOnCrafter: Unleashing Camera Trajectories for Realistic Video Virtual Try-on via a Renderable 4D Try-on Proxy Abstract Camera-controllable video virtual try-on framework uses a 4D proxy with explicit human-environment decoupling and DiT-based video generation for omnidirectional viewing. Generated by Qwen/Qwen2.5-Coder-32B-Instruct While Video Virtual Try-on (VVT) has achieved… 4 r/LocalLLaMA community 7d ago [NEW MODEL] SupraWeather-Nano-Preview Just released! SupraWeather Nano is live! ⛈️ We just released SupraWeather-Nano (preview), a small FT-Transformer model purpose-built to classify weather phenomena from raw tabular meteorological features. https://huggingface.co/SupraLabs/SupraWeather-Nano-Demo https://huggingface.co/SupraLabs… 25 Hugging Face Daily Papers research 7d ago Look Light, Think Heavy: What Multimodal Chain-of-Thought Reasoning Can and Cannot Do Abstract Multimodal Chain-of-Thought reasoning shows selective effectiveness across different tasks, with limitations in maintaining visual introspection during reasoning processes. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Chain-of-Thought (CoT) has become a standard method… 17 Hugging Face Daily Papers research 7d ago DomainShuttle: Freeform Open Domain Subject-driven Text-to-video Generation Abstract DomainShuttle enables open domain subject-driven text-to-video generation with high fidelity and flexibility across in-domain and cross-domain scenarios through domain-aware modeling and dual RoPE schemes. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Open domain… 10 Hugging Face Daily Papers research 7d ago RoPE-Aware Bit Allocation for KV-Cache Quantization Abstract Block-GTQ introduces a RoPE-aware bit allocation method for key-cache quantization that improves attention accuracy and downstream performance through adaptive bit distribution and packed cache serving. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Existing low-bit… 22 Page 6 of 10 · 500 articles ← Newer Older →