Smol AI News
135 articles archived
-
Smol AI News news-outlet 2mo ago
GPT-Image-2
**OpenAI** launched **GPT-Image-2**, enhancing image generation with improved text rendering, layout fidelity, editing, multilingual support, and "thinking" capabilities. It supports generating slides, infographics, diagrams, UI mockups, and QR codes, and integrates with tools…
36 -
Smol AI News news-outlet 2mo ago
not much happened today
**Moonshot's Kimi K2.6** is a major open-weight **1T-parameter MoE** model featuring **32B active parameters**, **384 experts**, **MLA attention**, **256K context window**, native multimodality, and **INT4 quantization**. It supports day-0 integration with platforms like…
9 -
Smol AI News news-outlet 2mo ago
not much happened today
**Anthropic** launched **Claude Design**, a prototyping tool powered by **Claude Opus 4.7**, targeting design workflows and competing with **Figma** and others. Benchmarks show **Opus 4.7** leading in coding and text tasks, with improved efficiency and adaptive reasoning, though…
7 -
Smol AI News news-outlet 2mo ago
Anthropic's Claude Opus 4.7
**Anthropic** launched **Claude Opus 4.7**, its most capable Opus model yet, featuring stronger coding and agentic performance, a new tokenizer, and improved long-context handling with a new **xhigh** reasoning tier. Benchmarks show substantial gains, including **SWE-bench Pro…
37 -
Smol AI News news-outlet 2mo ago
not much happened today
**OpenAI** expanded its Agents SDK by separating the agent harness from compute/storage, enabling long-running, durable agents with features like file/computer use, skills, memory, and compaction. The harness is now open-source and supports execution via partner sandboxes,…
37 -
Smol AI News news-outlet 2mo ago
not much happened today
**Harness engineering** is emerging as a key discipline in AI agent development, emphasizing components like filesystems, memory, and retries beyond just models. **OpenAI's Codex** is expanding agentic coding workflows beyond software engineering, including codebase…
32 -
Smol AI News news-outlet 2mo ago
not much happened today
**GLM-5.1** has reached **#3 on Code Arena**, surpassing **Gemini 3.1** and **GPT-5.4**, and matching **Claude Sonnet 4.6** in coding performance. **Z.ai** now holds the **#1 open model rank** and sits close to the top overall. The advisor pattern, combining a cheap executor with an…
12 -
Smol AI News news-outlet 2mo ago
not much happened today
**Anthropic's Mythos** and **OpenAI's** upcoming restricted cyber-capable models are central to recent discussions, with debates on their security realism and evaluation methods. **LangChain's Deep Agents deploy** introduces an open memory, model-agnostic agent harness…
36 -
Smol AI News news-outlet 2mo ago
not much happened today
**Meta Superintelligence Labs** launched **Muse Spark**, a natively multimodal reasoning model featuring tool use, visual chain of thought, and multi-agent orchestration. It is live on **meta.ai** and the Meta AI app with a private API preview and plans for open-sourcing future…
29 -
Smol AI News news-outlet 2mo ago
not much happened today
**Hermes Agent** is gaining attention as a leading open agent stack with features like self-improving skills, persistent memory, and a self-improvement loop. Its new **Manim skill** enables generation of math/technical animations, expanding agent capabilities. The Hermes…
19 -
Smol AI News news-outlet 2mo ago
not much happened today
**Google** introduced **Skills in Chrome**, enabling reusable browser workflows with Gemini prompts and a library of ready-made Skills, enhancing end-user agentization. **Tencent** teased **HYWorld 2.0**, an open-source 3D world model generating editable scenes from a single…
8 -
Smol AI News news-outlet 3mo ago
not much happened today
**Gemma 4** was launched by **Google** under an **Apache 2.0 license**, marking a significant open-model release focused on **reasoning, agentic workflows, multimodality, and on-device use**. It outperforms models 10x larger and has immediate ecosystem support including…
35 -
Smol AI News news-outlet 3mo ago
Gemma 4
**Google DeepMind** released **Gemma 4**, a family of open-weight, multimodal models with long-context support up to **256K tokens** under an **Apache 2.0 license**, marking a major capability and licensing shift. The lineup includes **31B dense**, **26B MoE (A4B)**, and two…
14 -
Smol AI News news-outlet 3mo ago
not much happened today
**Arcee’s Trinity-Large-Thinking** was released with **open weights under Apache 2.0**, featuring a **400B total / 13B active** model size and strong agentic performance, ranking **#2 on PinchBench**. **Z.ai’s GLM-5V-Turbo** is a **vision coding model** with **native multimodal…
13 -
Smol AI News news-outlet 3mo ago
not much happened today
**Anthropic** introduced **computer use inside Claude Code** for closed-loop verification in a research preview for Pro/Max users, enhancing reliable app iteration. **OpenAI** released a **Codex plugin for Claude Code**, enabling cross-agent composition and signaling a shift…
16 -
Smol AI News news-outlet 3mo ago
not much happened today
**Anthropic** is reportedly introducing a new AI model tier called **Capybara**, which is larger and more intelligent than **Claude Opus 4.6**, showing improved performance in coding, academic reasoning, and cybersecurity. The model is speculated to be around **10 trillion…
38 -
Smol AI News news-outlet 3mo ago
not much happened today
**Anthropic** advances agent infrastructure with a multi-agent harness emphasizing orchestration and "computer use" for complex software environments. **Figma**, **GitHub**, and **Cursor** launch design canvases with direct AI editing, showcasing tool-calling becoming…
12 -
Smol AI News news-outlet 3mo ago
not much happened today
The **ARC-AGI-3** benchmark, introduced by **@arcprize** and **François Chollet**, resets the frontier for general agentic reasoning: humans solve 100% of tasks versus under 1% for current models. It focuses on zero-preparation generalization and human-like learning efficiency.…
4 -
Smol AI News news-outlet 3mo ago
not much happened today
**Google** launched **Gemini 3.1 Flash Live**, a realtime voice and vision agent model with **2x longer conversation memory**, supporting **70 languages** and **128k context**. **Mistral AI** released **Voxtral TTS**, a low-latency, open-weight text-to-speech model supporting…
31 -
Smol AI News news-outlet 3mo ago
The Claude Code Source Leak
**Anthropic's** closed-source coding product **Claude Code** experienced a significant source leak exposing over **500k lines** of orchestration logic, including autonomous modes and memory systems, but not model weights. The leak led to rapid public reverse-engineering,…
14 -
Smol AI News news-outlet 3mo ago
not much happened today
**Anthropic** introduced desktop control of mouse, keyboard, and screen for **Claude Cowork** and **Claude Code** in a **macOS research preview**, expanding agent capabilities beyond APIs and browsers. The agent ecosystem is evolving towards long-running, parallel,…
29 -
Smol AI News news-outlet 3mo ago
not much happened today
**Cursor's Composer 2**, built on **Kimi K2.5**, sparked discussion over model attribution and licensing, highlighting a shift toward post-trained derivatives of open-source models with domain-specific fine-tuning and reinforcement learning. **Claude Code** is expanding into…
36 -
Smol AI News news-outlet 3mo ago
not much happened today
**Cursor** launched **Composer 2**, a frontier-class coding model with major cost reductions and strong benchmark scores like **61.3 on CursorBench** and **73.7 on SWE-bench Multilingual**. The model was improved via a **first continued pretraining run** feeding into…
36 -
Smol AI News news-outlet 3mo ago
not much happened today
**OpenAI** released **GPT-5.4 mini** and **GPT-5.4 nano**, its most capable small models, optimized for coding, multimodal understanding, and subagents, featuring a **400k context window** and more than **2x the speed** of GPT-5 mini. The mini model approaches larger GPT-5.4…
32 -
Smol AI News news-outlet 3mo ago
not much happened today
**Moonshot's Attention Residuals** paper introduced an input-dependent attention mechanism over prior layers with a **1.25x compute advantage** and less than **2% inference latency overhead**, validated on **Kimi Linear 48B total / 3B active**. The paper sparked debate on…
26 -
Smol AI News news-outlet 3mo ago
not much happened today
**MCP tools** remain relevant for deterministic APIs despite ergonomic criticisms, with new **web MCP support in Chrome v146** enabling continuous browsing agents. Persistent memory is emerging as a key differentiator for agents, with IBM improving task completion rates and…
5 -
Smol AI News news-outlet 3mo ago
not much happened today
**Harnesses, agent infrastructure, and the MCP protocol** are central themes, with emphasis on how **harnesses, sandboxes, filesystem access, skills, memory, and observability** shape agent UI/UX and runtime environments. Despite jokes about MCP's demise, it remains vital in…
26 -
Smol AI News news-outlet 3mo ago
not much happened today
**NVIDIA’s Nemotron 3 Super** is a **120B parameter / ~12B active** open model featuring a **hybrid Mamba-Transformer / SSM Latent MoE** architecture and **1M context window**, delivering up to **2.2x faster inference than GPT-OSS-120B** in FP4 with strong throughput gains. It…
10 -
Smol AI News news-outlet 3mo ago
not much happened today
**OpenAI** rolled out **GPT-5.4**, which ties **Gemini 3.1 Pro Preview** for **#1** on the **Artificial Analysis Intelligence Index** with a score of **57** (up from 51 for GPT-5.2 xhigh). GPT-5.4 features a larger **~1.05M token** context window and higher per-token prices…
12 -
Smol AI News news-outlet 4mo ago
not much happened today
**Gemini 3.1 Flash-Lite** is highlighted by **Demis Hassabis** for its speed and cost-efficiency, focusing on latency and cost per capability rather than raw performance. **NotebookLM Studio** introduces a new feature for generating immersive cinematic video overviews. Rumors…
20 -
Smol AI News news-outlet 4mo ago
not much happened today
**Google DeepMind** launched **Gemini 3.1 Flash-Lite**, emphasizing *dynamic thinking levels* for adjustable compute, with notable metrics like **$0.25/M input**, **$1.50/M output**, **1432 Elo on LMArena**, and **2.5× faster time-to-first-token** than Gemini 2.5 Flash. It…
35 -
Smol AI News news-outlet 4mo ago
not much happened today
**Alibaba** released the **Qwen 3.5** series with models ranging from **0.8B to 9B** parameters, featuring **native multimodality**, **scaled reinforcement learning**, and targeting **edge and lightweight agent** deployments. The models support very long context windows up to…
18 -
Smol AI News news-outlet 4mo ago
Agentic Engineering: WTF Happened in December 2025?
**Perplexity** launched **Computer**, an orchestration-first agent platform featuring multi-model routing, usage-based pricing, and parallel asynchronous sub-agents for distributed workflows. **Andrej Karpathy** claims a "phase change" in coding agents since December,…
21 -
Smol AI News news-outlet 4mo ago
not much happened today
**Gemini 3.1 Pro** demonstrates strong retrieval capabilities and cost efficiency compared to **GPT-5.2** and **Opus 4.6**, though users report tooling and UI issues. The **SWE-bench Verified** evaluation methodology is under scrutiny for consistency, with updates bringing…
27 -
Smol AI News news-outlet 4mo ago
Gemini 3.1 Pro: 2x 3.0 on ARC-AGI 2
**Google** released **Gemini 3.1 Pro**, a developer preview integrated across the **Gemini app**, **NotebookLM**, **Gemini API / AI Studio**, and **Vertex AI**, highlighting a significant reasoning improvement with **ARC-AGI-2 = 77.1%** and strong coding and agentic-tool…
10 -
Smol AI News news-outlet 4mo ago
not much happened today
**Anthropic** released **Claude Opus/Sonnet 4.6**, showing a significant intelligence index jump but with increased token usage and cost. **Anthropic** also shared insights on AI agent autonomy, highlighting human-in-the-loop prevalence and software engineering tool calls.…
5 -
Smol AI News news-outlet 4mo ago
Z.ai GLM-5: New SOTA Open Weights LLM
**Zhipu AI** launched **GLM-5**, an **Opus-class** model scaling from **355B to 744B parameters** with **DeepSeek Sparse Attention** integration for cost-efficient long-context serving. GLM-5 achieves **SOTA on BrowseComp** and leads on **Vending Bench 2**, focusing on office…
18