Tag

Developer Tool

500 articles archived under #developer-tool · RSS

Vercel — AI dev-tools 1mo ago

Grok Build 0.1 now available on Vercel AI Gateway

Grok Build 0.1 is now available on Vercel AI Gateway. This is a beta coding model trained for agentic coding, currently in early access, and it powers the Grok Build CLI app. Reasoning effort is not configurable, and there is no non-reasoning mode. To use Grok Build 0.1, set model…

12
arXiv — Machine Learning research 1mo ago

Data-Free Client Contribution Estimation via Logit Maximization for Federated Learning

arXiv:2605.18892v1 Announce Type: new Abstract: Federated learning (FL) enables collaborative learning of computer vision models, where privacy and regulatory constraints prevent centralizing data across devices or organizations. However, practical FL deployments often exhibit…

38
arXiv — NLP / Computation & Language research 1mo ago

Prompting language influences diagnostic reasoning and accuracy of large language models

arXiv:2605.19173v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly explored for clinical decision support, yet most evaluations are conducted in English, leaving their reliability in other languages uncertain. Here we evaluate the impact of prompting…

4
arXiv — NLP / Computation & Language research 1mo ago

Synthesis and Evaluation of Long-term History-aware Medical Dialogue

arXiv:2605.19766v1 Announce Type: new Abstract: An effective healthcare agent must be able to recall and reason over a patient's longitudinal medical history. However, the absence of datasets with realistic long-term dialogue timelines limits systematic evaluation. Real clinical…

35
arXiv — NLP / Computation & Language research 1mo ago

CLIF: Concept-Level Influence Functions for Transparent Bottleneck Models

arXiv:2605.19848v1 Announce Type: new Abstract: In recent years, the black-box nature of deep learning models has limited their application in high-stakes domains such as medical diagnosis and finance, where interpretability is essential. To address this, we propose a novel…

18
arXiv — NLP / Computation & Language research 1mo ago

PromptRad: Knowledge-Enhanced Multi-Label Prompt-Tuning for Low-Resource Radiology Report Labeling

arXiv:2605.20052v1 Announce Type: new Abstract: Automatic report labeling facilitates the identification of clinical findings from unstructured text and enables large-scale annotation for medical imaging research. Existing rule-based labelers struggle with the diverse…

13
arXiv — NLP / Computation & Language research 1mo ago

ClinSeekAgent: Automating Multimodal Evidence Seeking for Agentic Clinical Reasoning

arXiv:2605.20176v1 Announce Type: new Abstract: Large language models (LLMs) and agentic systems have shown promise for clinical decision support, but existing works largely assume that evidence has already been curated and handed to the model. Real-world clinical workflows…

19
Vercel — AI dev-tools 1mo ago

Chat SDK now includes AI SDK tools

Chat SDK now ships a built-in AI SDK toolset through the new chat/ai subpath. One createChatTools(chat) call wires Chat SDK's read and write actions into your agent. Approval by default: write tools are gated by a requireApproval option. Presets: reader, messenger, and…

8
Vercel — AI dev-tools 1mo ago

Chat SDK adds message subjects and direct SDK access

You can now read the parent issue or pull request when your bot is mentioned in a Linear or GitHub comment. message.subject resolves to that parent with title, status, URL, and the full typed payload. message.subject is cached per message, so repeated access only hits the API…

13
Vercel — AI dev-tools 1mo ago

Chat SDK now supports callback URLs on buttons and modals

You can now pause a Workflow run on a Chat SDK card and resume it when someone clicks a button. The same flow works for form submissions. Buttons and modals accept a new callbackUrl prop, and the event payload is sent to that endpoint. To build a card like this, create a…

36
Hacker News — AI on Front Page community 1mo ago

Remove-AI-Watermarks – CLI and library for removing AI watermarks from images

Article URL: https://github.com/wiltodelta/remove-ai-watermarks Comments URL: https://news.ycombinator.com/item?id=48200569 Points: 252 # Comments: 138

4
Hacker News — AI on Front Page community 1mo ago

Gemini CLI will stop working from June 18, 2026

Article URL: https://developers.googleblog.com/an-important-update-transitioning-gemini-cli-to-antigravity-cli/ Comments URL: https://news.ycombinator.com/item?id=48196867 Points: 216 # Comments: 114

15
TechCrunch — AI news-outlet 1mo ago

Agentic app coding gets an upgrade with Google’s release of Android CLI

Google is embracing the rise of AI coding agents with new Android tools designed to work with platforms like Claude Code and OpenAI’s Codex, allowing developers — or their AI assistants — to build Android apps faster from the command line.

14
TechCrunch — AI news-outlet 1mo ago

Google launches Antigravity 2.0 with an updated desktop app and CLI tool

Google is debuting a new AI Ultra plan priced at $100, which will give users a 5x higher usage limit than the AI Pro plan, alongside the Antigravity 2.0 launch.

36
TechCrunch — AI news-outlet 1mo ago

Google launches Antigravity 2.0 with an updated desktop app and CLI tool at IO 2026

Google is debuting a new AI Ultra plan priced at $100, which will give users a 5x higher usage limit than the AI Pro plan, alongside the Antigravity 2.0 launch.

13
Vercel — AI dev-tools 1mo ago

Nuxt MCP Toolkit now supports MCP apps

The Nuxt MCP Toolkit now supports MCP apps. Your agent tools can return interactive HTML responses that MCP clients like Claude and ChatGPT render inline, rather than plain-text responses. Declare a tool with the defineMcpApp macro, then read pre-hydrated data, trigger…

29
Anthropic SDK (Python) releases dev-tools 1mo ago

v0.103.0

0.103.0 (2026-05-19). Full Changelog: v0.102.0...v0.103.0. Features: client: add support for self-hosted sandboxes in CMA with sandbox helpers (e5625b0)

22
arXiv — Machine Learning research 1mo ago

Forecasting Medium-Horizon Alzheimer's Disease Progression: Residual Gap-Aware Transformers for 24-Month CDR-SB Change from ADNI Clinical and Biomarker Histories

arXiv:2605.16319v1 Announce Type: new Abstract: Medium-horizon Alzheimer's disease progression prediction is difficult because future clinical scores can remain tied to baseline severity, while biomarker histories are irregular and incompletely observed. We develop an…

6
arXiv — Machine Learning research 1mo ago

Federated Nested Learning: Collaborative Training of Self-Referential Memories for Test-Time Adaptation

arXiv:2605.16350v1 Announce Type: new Abstract: We rethink Federated Learning (FL) from a nested learning perspective, framing the core challenge as how to collaboratively learn optimization rules, not just static models, to tackle Non-IID client data. To address this, we…

4
arXiv — Machine Learning research 1mo ago

ReTAMamba: Reliability-Aware Temporal Aggregation with Mamba for Irregular Clinical Time Series Prediction

arXiv:2605.16380v1 Announce Type: new Abstract: Clinical time-series data are difficult to model with methods designed for regular sequences because they exhibit irregular sampling, frequent missing values, and heterogeneous observation patterns across variables. Existing…

6
arXiv — Machine Learning research 1mo ago

GPU-Accelerated Deep Learning for Heatwave Prediction and Urban Heat Risk Assessment

arXiv:2605.16435v1 Announce Type: new Abstract: Heatwaves are an important problem in cities, and climate change makes this problem more difficult. In this paper, we present a GPU-based deep learning framework for next-day prediction of urban thermal conditions and for heat risk…

28
arXiv — Machine Learning research 1mo ago

Byzantine-Resilient Federated Learning via QUBO-Based Client Selection on Quantum Annealers

arXiv:2605.16438v1 Announce Type: new Abstract: Federated Learning (FL) trains a global model across decentralized clients while preserving data privacy, but at scale it is vulnerable to malicious updates. Byzantine-resilient aggregation methods such as MultiKrum score gradients…

23
arXiv — Machine Learning research 1mo ago

SCOUT: Cyclic Causal Discovery Under Soft Interventions with Unknown Targets

arXiv:2605.16620v1 Announce Type: new Abstract: Learning causal relationships between variables from data is a fundamental research area with many applications across disciplines. Most existing causal discovery algorithms rely on the assumptions that (i) the underlying system is…

33
arXiv — Machine Learning research 1mo ago

MedMIX: Modality-Internal Expert Fusion for Multimodal Medical Diagnosis

arXiv:2605.16639v1 Announce Type: new Abstract: Multimodal clinical prediction faces three challenges: multiple foundation models (FMs) with complementary strengths per modality, pervasive missing modalities at training and test time, and sample-specific variation in modality…

7
arXiv — Machine Learning research 1mo ago

UB-SMoE: Universally Balanced Sparse Mixture-of-Experts for Resource-adaptive Federated Fine-tuning of Foundation Models

arXiv:2605.16690v1 Announce Type: new Abstract: Heterogeneous LoRA-rank methods address system heterogeneity in federated fine-tuning of foundation models by assigning client-specific ranks based on computational capabilities. However, these methods achieve only marginal…

32
arXiv — NLP / Computation & Language research 1mo ago

Artificial Intolerance: Stigmatizing Language in Clinical Documentation Skews Large Language Model Decision-Making

arXiv:2605.17228v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly deployed in high-stakes domains such as clinical decision support and medical documentation. However, the robustness of these models against subtle linguistic variations, specifically…

19
arXiv — NLP / Computation & Language research 1mo ago

Transitivity Meets Cyclicity: Explicit Preference Decomposition for Dynamic Large Language Model Alignment

arXiv:2605.17342v1 Announce Type: new Abstract: Standard RLHF relies on transitive scalar rewards, failing to capture the cyclic nature of human preferences. While some approaches like the General Preference Model (GPM) address this, we identify a theoretical limitation: their…

11
arXiv — NLP / Computation & Language research 1mo ago

Bridging the Version Gap: Multi-version Training Improves ICD Code Prediction, Especially for Rare Codes

arXiv:2605.17755v1 Announce Type: new Abstract: Clinical coding maps clinical documentation to standardized medical codes, an essential yet time-consuming administrative task that could benefit from automation. Current models on ICD coding are typically optimized for codes from…

4
arXiv — NLP / Computation & Language research 1mo ago

Systematic Evaluation of the Quality of Synthetic Clinical Notes Rephrased by LLMs at Million-Note Scale

arXiv:2605.17775v1 Announce Type: new Abstract: Large language models (LLMs) can generate or synthesize clinical text for a wide range of applications, from improving clinical documentation to augmenting clinical text analytics. Yet evaluations typically focus on a narrow aspect…

8
r/LocalLLaMA community 1mo ago

favorite Agentic Coding Harness

So far, I’ve tried Codex CLI, Claude Code, Gemini CLI, OpenCode, and recently, Pi with local models. Pi is the leanest of them all, with just four tools: read, write, edit, and bash. Its system prompt is under 2K tokens, and it's perfect for local models. I've been trying…

29
Hacker News — AI on Front Page community 1mo ago

Pope Leo XIV’s first encyclical Magnifica humanitas to be published May 25

Article URL: https://www.vaticannews.va/en/pope/news/2026-05/pope-leo-xiv-first-encyclical-magnifica-humanitas.html Comments URL: https://news.ycombinator.com/item?id=48187201 Points: 255 # Comments: 176

17
Hacker News — AI on Front Page community 1mo ago

Click (2016)

Article URL: https://clickclickclick.click/ Comments URL: https://news.ycombinator.com/item?id=48187054 Points: 237 # Comments: 57

35
llama.cpp releases dev-tools 1mo ago

b9216

ui: Refactor models store, MCP service, and gate logs behind VITE_DEBUG (#23236); refactor: Scope console logs to DEV + VITE_DEBUG env vars; refactor: skip MCP proxy probe when no server requires it; refactor: suppress expected disconnect errors during MCP client shutdown…

33
GitHub Blog — AI & ML official-blog 1mo ago

Take your local GitHub sessions anywhere

Kick off work in VS Code or the CLI, finish it from your phone. Remote control for GitHub Copilot sessions is now generally available on github.com and GitHub Mobile.

32
Hugging Face Daily Papers research 1mo ago

Sparse Autoencoders enable Robust and Interpretable Fine-tuning of CLIP models

Abstract SAE-FT enables robust fine-tuning of vision-language models by regularizing visual representations through sparse autoencoder constraints, maintaining performance while improving robustness against distribution shifts. AI-generated summary Large-scale pre-trained…

34
arXiv — Machine Learning research 1mo ago

MuteBench: Modality Unavailability Tolerance Evaluation for Incomplete Multimodal Fusion

arXiv:2605.15235v1 Announce Type: new Abstract: Multimodal physiological data powers clinical AI systems from intensive care units to wearable devices, but sensors routinely fail in practice. Two failure modes are common: modality missing, where an entire channel is absent, and…

15
arXiv — Machine Learning research 1mo ago

Logical Grammar Induction via Graph Kolmogorov Complexity: A Neuro-Symbolic Framework for Self-Healing Clinical Data Integrity

arXiv:2605.15242v1 Announce Type: new Abstract: The reliability of Healthcare Information Systems (HIS) is frequently compromised by human-induced data entry errors, which existing statistical anomaly detection methods fail to distinguish from legitimate clinical extremes. This…

34
arXiv — Machine Learning research 1mo ago

PACER: Acyclic Causal Discovery from Large-Scale Interventional Data

arXiv:2605.15353v1 Announce Type: new Abstract: Inferring the structure of directed acyclic graphs (DAGs) from data is a central challenge in causal discovery, particularly in modern high-dimensional settings where large-scale interventional data are increasingly available.…

10
arXiv — Machine Learning research 1mo ago

GOMA: Toward Structure-Driven Multimodal Alignment from a Graph Signal Smoothing Perspective

arXiv:2605.15723v1 Announce Type: new Abstract: Multimodal alignment is commonly learned from isolated image-text pairs via CLIP-style dual encoders, leaving the relational context among entities largely unused. Multimodal attributed graphs (MAGs), where nodes carry multimodal…

37
arXiv — NLP / Computation & Language research 1mo ago

Retrieval-Augmented Large Language Models for Schema-Constrained Clinical Information Extraction

arXiv:2605.15467v1 Announce Type: new Abstract: Conversational nurse-patient transcripts contain actionable observations, but converting these transcripts into structured representations at scale remains challenging. Documentation burden is substantial, with prior studies…

31
arXiv — NLP / Computation & Language research 1mo ago

MHGraphBench: Knowledge Graph-Grounded Benchmarking of Mental Health Knowledge in Large Language Models

arXiv:2605.15589v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used in the mental health domain, yet it remains unclear how well they capture related biomedical knowledge and how reliably they apply it to clinically salient structured judgments.…

23
arXiv — NLP / Computation & Language research 1mo ago

Few-Shot Large Language Models for Actionable Triage Categorization of Online Patient Inquiries

arXiv:2605.15680v1 Announce Type: new Abstract: Online patient inquiries are often informal, incomplete, and written before professional assessment, yet they must still be routed to an appropriate level of clinical follow-up. We study this as a four-class actionable triage task…

18
arXiv — NLP / Computation & Language research 1mo ago

Can Large Language Models Imitate Human Speech for Clinical Assessment? LLM-Driven Data Augmentation for Cognitive Score Prediction

arXiv:2605.16077v1 Announce Type: new Abstract: Accurate assessment of cognitive decline from spontaneous speech remains challenging due to limited dataset size and class imbalance. In this work, we propose a large language model (LLM)-driven data augmentation framework to…

38
arXiv — NLP / Computation & Language research 1mo ago

Fully Open Meditron: An Auditable Pipeline for Clinical LLMs

arXiv:2605.16215v1 Announce Type: cross Abstract: Clinical decision support systems (CDSS) require scrutable, auditable pipelines that enable rigorous, reproducible validation. Yet current LLM-based CDSS remain largely opaque. Most "open" models are open-weight only, releasing…

9
arXiv — NLP / Computation & Language research 1mo ago

When Importance Sampling Misallocates Credit: Asymmetric Ratios for Outcome-Supervised RL

arXiv:2510.06062v2 Announce Type: replace Abstract: Reinforcement learning (RL) has shown great promise in large language models (LLMs) post-training, which typically rely on token-level clipping to maintain stability during optimization. Despite the empirical success of…

29
r/LocalLLaMA community 1mo ago

Made a simple template manager and GUI for llama.cpp so I don't have to keep memorizing CLI flags.

Introducing Hexllama. Hey, I’ve always found llama-server to be more than enough for testing out local models, mostly because it guarantees you always have the absolute latest llama.cpp features and architecture support. But keeping track of different CLI commands, context sizes,…

19
llama.cpp releases dev-tools 1mo ago

b9193

server: honor --embd-normalize CLI arg (#23125). The --embd-normalize flag was registered only for the embedding and debug examples, so llama-server rejected it and the /embedding handler used a hard-coded default of 2 (L2). Add LLAMA_EXAMPLE_SERVER to the flag's example set…

7
Hacker News — AI on Front Page community 1mo ago

Fecal transplants for autism deliver success in clinical trials

Article URL: https://refractor.io/adhd-autism/fecal-transplants-for-autism-delivers-success-in-clinical-trials/ Comments URL: https://news.ycombinator.com/item?id=48158494 Points: 213 # Comments: 157

16
r/LocalLLaMA community 1mo ago

Qwen3.6-35B-A3B and 9B are officially on the public Terminal-Bench 2.0 leaderboard!

little-coder × Qwen3.6-35B-A3B hit 24.6% (±3.2), and now land above Gemini 2.5 Pro on Gemini CLI (19.6%) and Qwen3-Coder-480B on Terminus 2 (23.9%). I didn’t expect the scaffold-model gap from…

13
OpenAI Python SDK releases dev-tools 1mo ago

v2.37.0

2.37.0 (2026-05-13). Full Changelog: v2.36.0...v2.37.0. Features: api: add service_tier parameter to responses compact method (625827c); internal/types: support eagerly validating pydantic iterators (7e527bc); Remove unnecessary client_id when using workload identity provider for…

15
