Developer Tool: 500 articles archived under #developer-tool
Vercel — AI dev-tools 1mo ago Grok Build 0.1 now available on Vercel AI Gateway Grok Build 0.1 is now available on Vercel AI Gateway. This is a beta coding model trained for agentic coding, currently in early access, and powers the Grok Build CLI app. Reasoning effort is not configurable, and there is no non-reasoning mode. To use Grok Build 0.1, set model… 12 arXiv — Machine Learning research 1mo ago Data-Free Client Contribution Estimation via Logit Maximization for Federated Learning arXiv:2605.18892v1 Announce Type: new Abstract: Federated learning (FL) enables collaborative learning of computer vision models, where privacy and regulatory constraints prevent centralizing data across devices or organizations. However, practical FL deployments often exhibit… 38 arXiv — NLP / Computation & Language research 1mo ago Prompting language influences diagnostic reasoning and accuracy of large language models arXiv:2605.19173v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly explored for clinical decision support, yet most evaluations are conducted in English, leaving their reliability in other languages uncertain. Here we evaluate the impact of prompting… 4 arXiv — NLP / Computation & Language research 1mo ago Synthesis and Evaluation of Long-term History-aware Medical Dialogue arXiv:2605.19766v1 Announce Type: new Abstract: An effective healthcare agent must be able to recall and reason over a patient's longitudinal medical history. However, the absence of datasets with realistic long-term dialogue timelines limits systematic evaluation. 
Real clinical… 35 arXiv — NLP / Computation & Language research 1mo ago CLIF: Concept-Level Influence Functions for Transparent Bottleneck Models arXiv:2605.19848v1 Announce Type: new Abstract: In recent years, the black-box nature of deep learning models has limited their application in high-stakes domains such as medical diagnosis and finance, where interpretability is essential. To address this, we propose a novel… 18 arXiv — NLP / Computation & Language research 1mo ago PromptRad: Knowledge-Enhanced Multi-Label Prompt-Tuning for Low-Resource Radiology Report Labeling arXiv:2605.20052v1 Announce Type: new Abstract: Automatic report labeling facilitates the identification of clinical findings from unstructured text and enables large-scale annotation for medical imaging research. Existing rule-based labelers struggle with the diverse… 13 arXiv — NLP / Computation & Language research 1mo ago ClinSeekAgent: Automating Multimodal Evidence Seeking for Agentic Clinical Reasoning arXiv:2605.20176v1 Announce Type: new Abstract: Large language models (LLMs) and agentic systems have shown promise for clinical decision support, but existing works largely assume that evidence has already been curated and handed to the model. Real-world clinical workflows… 19 Vercel — AI dev-tools 1mo ago Chat SDK now includes AI SDK tools Chat SDK now ships a built-in AI SDK toolset through the new chat/ai subpath. One createChatTools(chat) call wires Chat SDK's read and write actions into your agent. Approval by default: write tools are gated by a requireApproval option. Presets: reader, messenger, and… 8 Vercel — AI dev-tools 1mo ago Chat SDK adds message subjects and direct SDK access You can now read the parent issue or pull request when your bot is mentioned in a Linear or GitHub comment. message.subject resolves to that parent with title, status, URL, and the full typed payload. 
message.subject is cached per message, so repeated access only hits the API… 13 Vercel — AI dev-tools 1mo ago Chat SDK now supports callback URLs on buttons and modals You can now pause a Workflow run on a Chat SDK card and resume it when someone clicks a button. The same flow works for form submissions. Buttons and modals accept a new callbackUrl prop, and the event payload is sent to that endpoint. To build a card like this, create a… 36 Hacker News — AI on Front Page community 1mo ago Remove–AI–Watermarks – CLI and library for removing AI watermarks from images Article URL: https://github.com/wiltodelta/remove-ai-watermarks Comments URL: https://news.ycombinator.com/item?id=48200569 Points: 252 # Comments: 138 4 Hacker News — AI on Front Page community 1mo ago Gemini CLI will stop working from June 18, 2026 Article URL: https://developers.googleblog.com/an-important-update-transitioning-gemini-cli-to-antigravity-cli/ Comments URL: https://news.ycombinator.com/item?id=48196867 Points: 216 # Comments: 114 15 TechCrunch — AI news-outlet 1mo ago Agentic app coding gets an upgrade with Google’s release of Android CLI Google is embracing the rise of AI coding agents with new Android tools designed to work with platforms like Claude Code and OpenAI’s Codex, allowing developers — or their AI assistants — to build Android apps faster from the command line. 14 TechCrunch — AI news-outlet 1mo ago Google launches Antigravity 2.0 with an updated desktop app and CLI tool Google is debuting a new AI Ultra plan priced at $100, which will give users 5x more usage limit than the AI Pro plan alongside Antigravity 2.0 launch. 36 TechCrunch — AI news-outlet 1mo ago Google launches Antigravity 2.0 with an updated desktop app and CLI tool at IO 2026 Google is debuting a new AI Ultra plan priced at $100, which will give users 5x more usage limit than the AI Pro plan alongside the Antigravity 2.0 launch. 
13 Vercel — AI dev-tools 1mo ago Nuxt MCP Toolkit now supports MCP apps The Nuxt MCP Toolkit now supports MCP apps. Your agent tools can return interactive HTML responses that MCP clients like Claude and ChatGPT render inline, rather than plain-text responses. Declare a tool with the defineMcpApp macro, then read pre-hydrated data, trigger… 29 Anthropic SDK (Python) releases dev-tools 1mo ago v0.103.0 0.103.0 (2026-05-19) Full Changelog: v0.102.0...v0.103.0 Features client: Add support for self-hosted sandboxes in CMA with sandbox helpers (e5625b0) 22 arXiv — Machine Learning research 1mo ago Forecasting Medium-Horizon Alzheimer's Disease Progression: Residual Gap-Aware Transformers for 24-Month CDR-SB Change from ADNI Clinical and Biomarker Histories arXiv:2605.16319v1 Announce Type: new Abstract: Medium-horizon Alzheimer's disease progression prediction is difficult because future clinical scores can remain tied to baseline severity, while biomarker histories are irregular and incompletely observed. We develop an… 6 arXiv — Machine Learning research 1mo ago Federated Nested Learning: Collaborative Training of Self-Referential Memories for Test-Time Adaptation arXiv:2605.16350v1 Announce Type: new Abstract: We rethink Federated Learning (FL) from a nested learning perspective, framing the core challenge as how to collaboratively learn optimization rules, not just static models, to tackle Non-IID client data. To address this, we… 4 arXiv — Machine Learning research 1mo ago ReTAMamba: Reliability-Aware Temporal Aggregation with Mamba for Irregular Clinical Time Series Prediction arXiv:2605.16380v1 Announce Type: new Abstract: Clinical time-series data are difficult to model with methods designed for regular sequences because they exhibit irregular sampling, frequent missing values, and heterogeneous observation patterns across variables. 
Existing… 6 arXiv — Machine Learning research 1mo ago GPU-Accelerated Deep Learning for Heatwave Prediction and Urban Heat Risk Assessment arXiv:2605.16435v1 Announce Type: new Abstract: Heatwaves are an important problem in cities, and climate change makes this problem more difficult. In this paper, we present a GPU-based deep learning framework for next-day prediction of urban thermal conditions and for heat risk… 28 arXiv — Machine Learning research 1mo ago Byzantine-Resilient Federated Learning via QUBO-Based Client Selection on Quantum Annealers arXiv:2605.16438v1 Announce Type: new Abstract: Federated Learning (FL) trains a global model across decentralized clients while preserving data privacy, but at scale it is vulnerable to malicious updates. Byzantine-resilient aggregation methods such as MultiKrum score gradients… 23 arXiv — Machine Learning research 1mo ago SCOUT: Cyclic Causal Discovery Under Soft Interventions with Unknown Targets arXiv:2605.16620v1 Announce Type: new Abstract: Learning causal relationships between variables from data is a fundamental research area with many applications across disciplines. 
Most existing causal discovery algorithms rely on the assumptions that (i) the underlying system is… 33 arXiv — Machine Learning research 1mo ago MedMIX: Modality-Internal Expert Fusion for Multimodal Medical Diagnosis arXiv:2605.16639v1 Announce Type: new Abstract: Multimodal clinical prediction faces three challenges: multiple foundation models (FMs) with complementary strengths per modality, pervasive missing modalities at training and test time, and sample-specific variation in modality… 7 arXiv — Machine Learning research 1mo ago UB-SMoE: Universally Balanced Sparse Mixture-of-Experts for Resource-adaptive Federated Fine-tuning of Foundation Models arXiv:2605.16690v1 Announce Type: new Abstract: Heterogeneous LoRA-rank methods address system heterogeneity in federated fine-tuning of foundation models by assigning client-specific ranks based on computational capabilities. However, these methods achieve only marginal… 32 arXiv — NLP / Computation & Language research 1mo ago Artificial Intolerance: Stigmatizing Language in Clinical Documentation Skews Large Language Model Decision-Making arXiv:2605.17228v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly deployed in high-stakes domains such as clinical decision support and medical documentation. However, the robustness of these models against subtle linguistic variations, specifically… 19 arXiv — NLP / Computation & Language research 1mo ago Transitivity Meets Cyclicity: Explicit Preference Decomposition for Dynamic Large Language Model Alignment arXiv:2605.17342v1 Announce Type: new Abstract: Standard RLHF relies on transitive scalar rewards, failing to capture the cyclic nature of human preferences. 
While some approaches like the General Preference Model (GPM) address this, we identify a theoretical limitation: their… 11 arXiv — NLP / Computation & Language research 1mo ago Bridging the Version Gap: Multi-version Training Improves ICD Code Prediction, Especially for Rare Codes arXiv:2605.17755v1 Announce Type: new Abstract: Clinical coding maps clinical documentation to standardized medical codes, an essential yet time-consuming administrative task that could benefit from automation. Current models on ICD coding are typically optimized for codes from… 4 arXiv — NLP / Computation & Language research 1mo ago Systematic Evaluation of the Quality of Synthetic Clinical Notes Rephrased by LLMs at Million-Note Scale arXiv:2605.17775v1 Announce Type: new Abstract: Large language models (LLMs) can generate or synthesize clinical text for a wide range of applications, from improving clinical documentation to augmenting clinical text analytics. Yet evaluations typically focus on a narrow aspect… 8 r/LocalLLaMA community 1mo ago favorite Agentic Coding Harness So far, I’ve tried Codex CLI, Claude Code, Gemini CLI, OpenCode, and recently, Pi with local models. Pi is the leanest of them all, with just four tools: read, write, edit, and bash. Its system prompt is only under 2K tokens, and it's perfect for local models. 
I've been trying… 29 Hacker News — AI on Front Page community 1mo ago Pope Leo XIV’s first encyclical Magnifica humanitas to be published May 25 Article URL: https://www.vaticannews.va/en/pope/news/2026-05/pope-leo-xiv-first-encyclical-magnifica-humanitas.html Comments URL: https://news.ycombinator.com/item?id=48187201 Points: 255 # Comments: 176 17 Hacker News — AI on Front Page community 1mo ago Click (2016) Article URL: https://clickclickclick.click/ Comments URL: https://news.ycombinator.com/item?id=48187054 Points: 237 # Comments: 57 35 llama.cpp releases dev-tools 1mo ago b9216 ui: Refactor models store, MCP service, and gate logs behind VITE_DEBUG (#23236) refactor: Scope console logs to DEV + VITE_DEBUG env vars refactor: skip MCP proxy probe when no server requires it refactor: suppress expected disconnect errors during MCP client shutdown… 33 GitHub Blog — AI & ML official-blog 1mo ago Take your local GitHub sessions anywhere Kick off work in VS Code or the CLI, finish it from your phone. Remote control for GitHub Copilot sessions is now generally available on github.com and GitHub Mobile. The post Take your local GitHub sessions anywhere appeared first on The GitHub Blog. 32 Hugging Face Daily Papers research 1mo ago Sparse Autoencoders enable Robust and Interpretable Fine-tuning of CLIP models Abstract SAE-FT enables robust fine-tuning of vision-language models by regularizing visual representations through sparse autoencoder constraints, maintaining performance while improving robustness against distribution shifts. AI-generated summary Large-scale pre-trained… 34 arXiv — Machine Learning research 1mo ago MuteBench: Modality Unavailability Tolerance Evaluation for Incomplete Multimodal Fusion arXiv:2605.15235v1 Announce Type: new Abstract: Multimodal physiological data powers clinical AI systems from intensive care units to wearable devices, but sensors routinely fail in practice. 
Two failure modes are common: modality missing, where an entire channel is absent, and… 15 arXiv — Machine Learning research 1mo ago Logical Grammar Induction via Graph Kolmogorov Complexity: A Neuro-Symbolic Framework for Self-Healing Clinical Data Integrity arXiv:2605.15242v1 Announce Type: new Abstract: The reliability of Healthcare Information Systems (HIS) is frequently compromised by human-induced data entry errors, which existing statistical anomaly detection methods fail to distinguish from legitimate clinical extremes. This… 34 arXiv — Machine Learning research 1mo ago PACER: Acyclic Causal Discovery from Large-Scale Interventional Data arXiv:2605.15353v1 Announce Type: new Abstract: Inferring the structure of directed acyclic graphs (DAGs) from data is a central challenge in causal discovery, particularly in modern high-dimensional settings where large-scale interventional data are increasingly available.… 10 arXiv — Machine Learning research 1mo ago GOMA: Toward Structure-Driven Multimodal Alignment from a Graph Signal Smoothing Perspective arXiv:2605.15723v1 Announce Type: new Abstract: Multimodal alignment is commonly learned from isolated image-text pairs via CLIP-style dual encoders, leaving the relational context among entities largely unused. Multimodal attributed graphs (MAGs), where nodes carry multimodal… 37 arXiv — NLP / Computation & Language research 1mo ago Retrieval-Augmented Large Language Models for Schema-Constrained Clinical Information Extraction arXiv:2605.15467v1 Announce Type: new Abstract: Conversational nurse-patient transcripts contain actionable observations, but converting these transcripts into structured representations at scale remains challenging. 
Documentation burden is substantial, with prior studies… 31 arXiv — NLP / Computation & Language research 1mo ago MHGraphBench: Knowledge Graph-Grounded Benchmarking of Mental Health Knowledge in Large Language Models arXiv:2605.15589v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used in the mental health domain, yet it remains unclear how well they capture related biomedical knowledge and how reliably they apply it to clinically salient structured judgments.… 23 arXiv — NLP / Computation & Language research 1mo ago Few-Shot Large Language Models for Actionable Triage Categorization of Online Patient Inquiries arXiv:2605.15680v1 Announce Type: new Abstract: Online patient inquiries are often informal, incomplete, and written before professional assessment, yet they must still be routed to an appropriate level of clinical follow-up. We study this as a four-class actionable triage task… 18 arXiv — NLP / Computation & Language research 1mo ago Can Large Language Models Imitate Human Speech for Clinical Assessment? LLM-Driven Data Augmentation for Cognitive Score Prediction arXiv:2605.16077v1 Announce Type: new Abstract: Accurate assessment of cognitive decline from spontaneous speech remains challenging due to limited dataset size and class imbalance. In this work, we propose a large language model (LLM)-driven data augmentation framework to… 38 arXiv — NLP / Computation & Language research 1mo ago Fully Open Meditron: An Auditable Pipeline for Clinical LLMs arXiv:2605.16215v1 Announce Type: cross Abstract: Clinical decision support systems (CDSS) require scrutable, auditable pipelines that enable rigorous, reproducible validation. Yet current LLM-based CDSS remain largely opaque. 
Most "open" models are open-weight only, releasing… 9 arXiv — NLP / Computation & Language research 1mo ago When Importance Sampling Misallocates Credit: Asymmetric Ratios for Outcome-Supervised RL arXiv:2510.06062v2 Announce Type: replace Abstract: Reinforcement learning (RL) has shown great promise in large language models (LLMs) post-training, which typically rely on token-level clipping to maintain stability during optimization. Despite the empirical success of… 29 r/LocalLLaMA community 1mo ago Made a simple template manager and GUI for llama.cpp so I don't have to keep memorizing CLI flags. Introducing Hexllama Hey, I’ve always found llama-server to be more than enough for testing out local models, mostly because it guarantees you always have the absolute latest llama.cpp features and architecture support. But keeping track of different CLI commands, context sizes,… 19 llama.cpp releases dev-tools 1mo ago b9193 server : honor --embd-normalize CLI arg (#23125) The --embd-normalize flag was registered only for the embedding and debug examples, so llama-server rejected it and the /embedding handler used a hard-coded default of 2 (L2). Add LLAMA_EXAMPLE_SERVER to the flag's example set… 7 Hacker News — AI on Front Page community 1mo ago Fecal transplants for autism deliver success in clinical trials Article URL: https://refractor.io/adhd-autism/fecal-transplants-for-autism-delivers-success-in-clinical-trials/ Comments URL: https://news.ycombinator.com/item?id=48158494 Points: 213 # Comments: 157 16 r/LocalLLaMA community 1mo ago Qwen3.6-35B-A3B and 9B are officially on the public Terminal-Bench 2.0 leaderboard! Qwen3.6-35B-A3B and 9B are officially on the public Terminal-Bench 2.0 leaderboard! little-coder × Qwen3.6-35B-A3B hit 24.6% (±3.2), and now land above Gemini 2.5 Pro on Gemini CLI (19.6%) and Qwen3-Coder-480B on Terminus 2 (23.9%). 
I didn’t expect the scaffold-model gap from… 13 OpenAI Python SDK releases dev-tools 1mo ago v2.37.0 2.37.0 (2026-05-13) Full Changelog: v2.36.0...v2.37.0 Features api: add service_tier parameter to responses compact method (625827c) internal/types: support eagerly validating pydantic iterators (7e527bc) Remove unnecessary client_id when using workload identity provider for… 15