Model releases: 500 articles archived under #model-release. Hugging Face Daily Papers research 2d ago TheoremGraph: Bridging Formal and Informal Mathematics Abstract A unified mathematical dependency graph connects informal and formal mathematics through semantic embedding and automated extraction from arXiv papers and Lean projects. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Mathematical knowledge is organized around statements… 32 Hugging Face Daily Papers research 2d ago Learning Transferable Dynamics Priors from Action to World Modeling Abstract Action-conditioned world modeling enables transferable dynamics priors for robot learning through pretraining on large-scale manipulation data, supporting both simulator-based policy evaluation and video-action prediction. Generated by Qwen/Qwen2.5-Coder-32B-Instruct We… 27 Hugging Face Daily Papers research 2d ago The Surprising Effectiveness of Video Diffusion Models for Hand Motion Reconstruction Abstract ViDiHand uses pretrained video diffusion model representations with hand-overlay rendering to reconstruct 4D hand motion directly from video frames without detectors or optimization. Generated by Qwen/Qwen2.5-Coder-32B-Instruct 4D hand motion reconstruction from… 31 r/LocalLLaMA community 2d ago Tesla V100 16GB local LLMs, single and dual NVLink benchmarks Picked up a couple of Tesla V100-SXM2-16GB modules a while back to run local models and drive Claude Code fully offline, figured the actual numbers and the traps might save someone else the pain. They've come right down in price and the 16GB of HBM2 at ~900 GB/s still holds up… 33 Hugging Face Daily Papers research 2d ago Interleaved Speech Language Models Latently Work In Text Abstract Interleaved speech-text language models exhibit an implicit transcription phase where text tokens become decodable in intermediate layers, followed by text-based prediction before speech domain transformation. 
Generated by Qwen/Qwen2.5-Coder-32B-Instruct Speech language… 16 Smol AI News news-outlet 2d ago not much happened today **Anthropic** launched **Claude Sonnet 5** as its new default mid-tier frontier model, featuring a **1M-token context window**, enhanced agentic capabilities including planning, browser and terminal tool use, and autonomous execution previously requiring larger models. The model… 27 Hugging Face Daily Papers research 2d ago Video-MME-Logical: A Controlled Diagnostic Benchmark for Video Temporal-Logical Reasoning Abstract A new benchmark evaluates multimodal large language models' ability to reason over dynamic visual evidence through controlled temporal-logical operations rather than simple object recognition. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Recent interest in multimodal… 25 r/LocalLLaMA community 2d ago Anyone using Gemma4:31b over Qwen3.6:27b or 35b(a10) Using them in opencode. Mainly writing python scripts to set up workflows. I really do like Gemma4 even though it just sometimes doesn’t want to go the extra length. I really have to end up pushing it. It’s like really stubborn or something lol For both Qwen models, they’re… 17 Hugging Face Daily Papers research 2d ago Trimming the Long-Tail of Visual World Modeling Evaluation Abstract Current visual world models demonstrate limited generalization beyond common physical interactions, struggling with rare and irregular scenarios despite achieving realism on standard benchmarks. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Physical interactions follow a… 28 arXiv — NLP / Computation & Language research 2d ago Open but Incompatible: A License Compatibility Analysis of Corpora for Low-Resource African Languages arXiv:2606.28867v1 Announce Type: new Abstract: Creative Commons licenses dominate African NLP corpus releases, but their compatibility rules are rarely applied. 
CC-BY-SA and CC-BY-NC cannot be combined in a single published dataset; a NoDerivs clause silently prohibits… 28 arXiv — NLP / Computation & Language research 2d ago Fine-Tuning General-Purpose Large Language Models for Agricultural Applications: A Reproducible Framework and Evaluation Protocol Based on Qwen3-8B arXiv:2606.28992v1 Announce Type: new Abstract: General-purpose large language models (LLMs) have demonstrated strong abilities in open-domain question answering, information extraction, and text generation. Agricultural applications, however, are domain-specific,… 20 arXiv — NLP / Computation & Language research 2d ago Fast Numbers, Slow Language: Bridging Quantitative and Qualitative Earnings Signals arXiv:2606.29734v1 Announce Type: new Abstract: Earnings announcements release two types of information sequentially: quantitative surprise (numeric earnings-per-share (EPS)/revenue versus analyst estimate) arrives first in press releases and financial news, processed by… 12 arXiv — NLP / Computation & Language research 2d ago Are We Measuring Strategy or Phrasing? The Gap Between Surface- and Approach-Level Diversity in LLM Math Reasoning arXiv:2606.29985v1 Announce Type: new Abstract: Diversity in LLM mathematical reasoning is critical for exploration, but common diversity metrics mostly capture surface-level variation rather than differences in how a problem is solved. We address this gap by introducing… 27 Hugging Face Daily Papers research 2d ago LiveEdit: Towards Real-Time Diffusion-Based Streaming Video Editing Abstract A novel streaming video editing framework enables causal, frame-by-frame editing with stable long-horizon preservation and real-time responsiveness through a three-stage distillation pipeline and AR-oriented mask cache. 
Generated by Qwen/Qwen2.5-Coder-32B-Instruct… 24 Hugging Face Daily Papers research 2d ago Geometric Stability of Neural Population Codes: Regional Variation, Behavioral Relevance, and Circuit Dependence Abstract Geometric stability measures the consistency of pairwise stimulus distances across trials, revealing a distinct aspect of neural representation that differs from temporal stability and decoding accuracy. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Current models of… 27 Hugging Face Daily Papers research 2d ago Walking in the Implicit: Interactive World Exploration via Neural Scene Representation Abstract NeuWorld enables efficient interactive video generation by representing scenes as compact neural implicit states and using a transformer VAE with diffusion transformer for trajectory-conditioned rendering. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Interactive video… 25 Hugging Face Daily Papers research 2d ago SafePyramid: A Hierarchical Benchmark for In-context Policy Guardrailing Abstract SafePyramid benchmark evaluates guardrail systems' ability to identify safety violations through in-context policy specification across multiple domains and complexity levels. Generated by Qwen/Qwen2.5-Coder-32B-Instruct In real-world applications, guardrails are often… 5 Hugging Face Daily Papers research 2d ago PoseShield: Neural Collision Fields for Human Self-Collision Resolution Abstract PoseShield addresses self-collision issues in SMPL-based human pose estimation by applying neural collision constraints in pose space through constrained optimization and Eikonal regularization. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Self-collision remains a… 15 Hugging Face Daily Papers research 2d ago Orca: The World is in Your Mind Abstract Orca establishes a unified world latent space through next-state-prediction modeling using multimodal data and demonstrates superior performance in downstream tasks compared to specialized baselines. 
Generated by Qwen/Qwen2.5-Coder-32B-Instruct We introduce Orca, an… 38 Hugging Face Daily Papers research 2d ago ReFreeKV: Towards Threshold-Free KV Cache Compression Abstract ReFreeKV addresses the limitations of threshold-dependent KV cache pruning by introducing a threshold-free approach that adaptively allocates compression budgets while maintaining full-cache performance across diverse datasets and model sizes. Generated by… 31 Hugging Face Daily Papers research 2d ago Monte Carlo Energy Aggregation for Mobile 3D Gaussian Splatting Abstract Flux-GS enables real-time high-fidelity 3D Gaussian Splatting on mobile platforms through efficient lighting representation, attribute-conditioned enhancement, and multi-view densification strategies. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Recent advances in 3D… 10 Hugging Face Daily Papers research 2d ago Nemotron-Labs-Diffusion-Image: Advancing Masked Discrete Diffusion for High-Resolution Image Synthesis Abstract A masked discrete diffusion model for text-to-image synthesis that addresses limitations in token refinement and training efficiency through novel mechanisms and optimizations. Generated by Qwen/Qwen2.5-Coder-32B-Instruct We propose Nemotron-Labs-Diffusion-Image, a… 25 Hugging Face Daily Papers research 2d ago PolicyGuard: A Dialogue-Grounded Sub-Agent Verifier for Policy Adherence in LLM Agents Abstract POLICYGUARD is a sub-agent verifier that enhances LLM agent policy adherence by providing contextual reasoning and conversation-specific feedback across multi-turn interactions. Generated by Qwen/Qwen2.5-Coder-32B-Instruct LLM agents handle user requests on behalf of… 11 TechCrunch — AI news-outlet 2d ago Vibe coding platform Base44 launches own model as AI startups seek defensibility Wix-owned vibe coding platform Base44 has started rolling out its own AI model — with hopes that it will eventually outperform frontier models. 
8 Hugging Face Daily Papers research 2d ago GUICrafter: Weakly-Supervised GUI Agent Leveraging Massive Unannotated Screenshots Abstract GUICrafter addresses GUI agent data challenges through a weakly-supervised approach using unannotated screenshots and a two-stage curriculum learning framework for visual grounding and reinforcement learning calibration. Generated by Qwen/Qwen2.5-Coder-32B-Instruct… 10 Hugging Face Daily Papers research 2d ago Cognitive Episodes in LLM Reasoning Traces Enable Interpretable Human Item Difficulty Prediction Abstract Epi2Diff framework transforms LRM reasoning traces into cognitive episodes to predict human item difficulty more accurately than existing methods. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Predicting human item difficulty is central to educational assessment, where… 8 Hugging Face Daily Papers research 2d ago MIMFlow: Integrating Masked Image Modeling with Normalizing Flows for End-to-End Image Generation Abstract MIMFlow combines Normalizing Flows with Masked Image Modeling to improve generative modeling by decoupling semantic representation from pixel-level details, achieving better performance with fewer tokens. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Normalizing Flows… 37 r/LocalLLaMA community 2d ago How I'm using local models from real-world coding Just want to share since after many attempts over the past year, I finally have a setup I kinda like and does useful work for me. I only have 32GB of RAM and a 4070 8GB (laptop), just very ordinary hardware. I found that Qwen3.6-35B-A3B runs reliably at about 15 tokens per… 25 r/LocalLLaMA community 2d ago Been running Qwen3.6-27B through a 3-critic harness. The harness matters more than I thought Been running Qwen3.6-27B (8-bit) through my coding harness for a few days, alongside GLM5.2. The harness uses 3 critics — code review, test review, Playwright e2e — each with fresh context before accepting output. Qwen3.6 is legit for a 27B dense model. 
Benchmarks weren't lying.… 19 Vercel — AI dev-tools 2d ago Run multiple frameworks in one project with Vercel Services You can now deploy multiple frontends and backends together within a single Vercel project. Vercel Services is now available, allowing you to deploy full stack apps with multiple frameworks on a shared domain, where services talk to each other privately and deployments build,… 29 Vercel — AI dev-tools 2d ago Introducing VCR: Vercel Container Registry You can now push, pull, and manage container images directly on Vercel. Vercel Container Registry is an OCI-compliant image registry hosted on Vercel's infrastructure. It works with standard workflows - simply docker push, docker pull, and docker tag - so there's nothing new… 37 Vercel — AI dev-tools 2d ago Vercel Sandbox now supports Custom Images Vercel Sandboxes now support custom images. Launching in public beta today, images allow Sandboxes to start with your own custom root filesystem. Images are pulled from Vercel Container Registry, so anything you docker push is immediately available. Bring your own OS,… 21 Vercel — AI dev-tools 2d ago Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image) now on AI Gateway Nano Banana 2 Lite from Google is now available on AI Gateway. This Flash-Lite-tier image model is built for fast, low-cost generation. It generates images alongside text in The cost is also lower than previous Nano Banana models. Nano Banana 2 Lite generates 1K images at… 17 OpenAI official-blog 2d ago Introducing GeneBench-Pro Introducing GeneBench-Pro, a new benchmark testing AI performance in genomics, biology, and scientific research using complex, real-world datasets. 22 Vercel — AI dev-tools 2d ago Claude Sonnet 5 now available on Vercel AI Gateway Claude Sonnet 5 from Anthropic is now available on AI Gateway. Sonnet 5 improves on Sonnet 4.6 across coding and agentic work, reaching outcomes on many tasks that previously needed an Opus model, at Sonnet pricing. 
The model is more agentic and follows instructions more… 14 Vercel — AI dev-tools 2d ago Vercel Private Blob is now generally available Vercel Private Blob is now generally available for all plans. Store sensitive files like user-uploaded photos, invoices, and agent memory, and control exactly who can read them. Private stores, Signed URLs, and OIDC authentication all graduate from beta with this release. Vercel… 22 Vercel — AI dev-tools 2d ago An expanded Vercel Agent: chat, investigations, and approved actions, now in public beta Today, we're launching expanded capabilities for Vercel Agent in public beta. Vercel Agent now lives in your dashboard and can investigate production issues, answer questions about your projects, and take action on your behalf. Because Agent runs inside the platform that deploys… 27 r/LocalLLaMA community 2d ago Introducing LongCat-2.0, a large-scale MoE language model with 1.6 trillion total parameters and ~48 billion activated per token. This was the stealth model that was on Openrouter under the name 'owl-alpha'. submitted by /u/AnticitizenPrime 18 r/LocalLLaMA community 2d ago Ornith 35B works reasonably well with Qwen3.6 35B DFlash speculative model I saw a solid 30-40% token gen increase from this: ./llama-server --no-mmap --port 8080 --host 0.0.0.0 -kvu -ts 75,70 \ --alias qwen -hf bartowski/deepreinforce-ai_Ornith-1.0-35B-GGUF:Q8_0 -sm layer -c 255000 -cram 0 \ -ctk f16 -ctv f16 -fa 1 --jinja -t 7 --metrics --temp 0.6… 12 LangChain releases dev-tools 2d ago langchain-openrouter==0.2.5 Changes since langchain-openrouter==0.2.4 release(openrouter): 0.2.5 (#38553) fix(openrouter): deduplicate repeated finish metadata (#38552) fix(openrouter): strip Responses reasoning IDs (#38383) 32 r/LocalLLaMA community 2d ago It’s time, Sam, it’s time. I mean….. I’m no CEO…. 
but it seems like this would be the absolute perfect time to drop a super powerful GPT-OSS-2 to throw a big ol’ wet blanket on Anthropic’s IPO. It doesn’t need to be like frontier or anything, just a 20b and a 120b that is as fast as the old versions, add… 31 Ollama releases dev-tools 2d ago v0.31.0 launch: check for min version for hermes desktop (#16912) 4 r/LocalLLaMA community 2d ago DeepSeek V4, PR merged into llama.cpp! The PR: https://github.com/ggml-org/llama.cpp/pull/24162 Time for everyone to git pull, cmake, and download GGUFs! On your marks, get set, go! submitted by /u/Squik67 4 r/LocalLLaMA community 2d ago Qwen3-tts.cpp + Compose Desktop GUI I improved my qwen3-tts.cpp implementation to be about 5x realtime on my RTX 5080. It is GGML based, so it should compile and run anywhere - however I only tested it with CPU & CUDA under Windows & Linux: https://github.com/Danmoreng/qwen3-tts.cpp Additionally I made a Desktop… 13 TechCrunch — AI news-outlet 2d ago Anthropic and Gov. Newsom forge deal allowing California government to use Claude at half price As Anthropic forges a closer relationship with the state of California, the federal government has made an enemy out of the OpenAI rival. 26 TechCrunch — AI news-outlet 2d ago Arena, the AI leaderboard everyone uses, is now a $100M business The startup, which runs a popular free AI leaderboard, launched its commercial service just last September. 23 Hacker News — AI on Front Page community 2d ago Qwen 3.6 27B is the sweet spot for local development Article URL: https://quesma.com/blog/qwen-36-is-awesome/ Comments URL: https://news.ycombinator.com/item?id=48721903 Points: 204 # Comments: 133 7 TechCrunch — AI news-outlet 2d ago Cursor now has a mobile app for guiding your coding agent on the go Cursor has launched a new mobile app for remote oversight over coding agents. 29 r/MachineLearning community 2d ago I'm trying to implement CALM paper, and I have some questions. 
[P] Hello, I'm trying to implement the Pocket TTS by kyutai-labs presented in this paper. Since they haven't released the training/fine-tuning code, I'm trying to implement it on my own to learn some stuff. I have read the paper, tried to implement it with much more… 34 Simon Willison community 2d ago Ornith-1.0: Self-Scaffolding LLMs for Agentic Coding This is an interesting new open weights (MIT licensed) model, the first model release from DeepReinforce. [...] with variants including 9B Dense, 31B Dense, 35B MoE, and 397B MoE. Built on top of pretrained Gemma 4 and Qwen… 5