Tag

Developer Tool

500 articles archived under #developer-tool · RSS

Ars Technica — AI news-outlet 15d ago

Anthropic "pauses" token-based billing for its Claude Agent SDK

Move originally planned for Monday would have heavily increased power users' costs.

21
r/LocalLLaMA community 15d ago

GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench and beats every other open model available

From Source: GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench, and beats every other open model available. It also beats Gemini, making it a frontier-level model for a fraction of the cost. Open weights is back. This model is a game changer. Source: Cline…

14
NVIDIA Developer Blog official-blog 15d ago

Build On-Device AI Companions with the NVIDIA ACE Game Agent SDK and Unreal Engine 5 Plugins

NVIDIA RTX technologies are deeply integrated into Unreal Engine 5 through the NVIDIA RTX Branch of Unreal Engine and the NVIDIA DLSS Unreal Engine plugin. This...

23
MIT Technology Review — AI news-outlet 16d ago

Want to get a data center online quickly? Give it some flex.

At the end of a tense and scoreless first half of a soccer match between the English men’s team and rival Germany, millions of Brits let out a collective sigh and did what they so often do in moments of stress: They made tea. That wave of electric kettles clicking on, however,…

26
Hugging Face Daily Papers research 16d ago

PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions

Abstract PhoneHarness presents a mixed-action benchmark and execution framework for evaluating phone-use agents on verifiable mobile workflows, demonstrating superior performance over existing approaches through deterministic action routing and auditable execution traces.…

13
arXiv — Machine Learning research 16d ago

Semantic Reasoning in Medicine: The Role of Knowledge Graphs Across Five Key Domains

arXiv:2606.15155v1 Announce Type: new Abstract: Knowledge graphs (KGs) have emerged as a promising solution for integrating and reasoning over complex biomedical and clinical data in healthcare. By representing structured relationships among entities such as diseases, drugs,…

17
arXiv — Machine Learning research 16d ago

RECTOR: Masked Region-Channel-Temporal Modeling for Affective and Cognitive Representation Learning

arXiv:2606.15278v1 Announce Type: new Abstract: Affective and cognitive disorders manifest as distributed, time-varying brain network dynamics across regions, channels, and time, challenging robust representation learning from EEG/sEEG for clinical diagnosis. We propose RECTOR…

34
arXiv — Machine Learning research 16d ago

Beyond Classification: A Cough Regression Benchmark for Respiratory Acoustic Foundation Models

arXiv:2606.15436v1 Announce Type: new Abstract: Respiratory acoustic foundation models (FMs) excel at cough classification, yet their ability to predict continuous health quantities from cough audio remains largely unexplored, despite the clinical value of passive age, BMI, and…

28
arXiv — Machine Learning research 16d ago

Z-Plane Neural Networks: Bounded Geometric Activation Replaces ReLU and LayerNorm

arXiv:2606.15669v1 Announce Type: new Abstract: Modern deep neural networks rely on Euclidean scalar activations (e.g., ReLU) and global normalization techniques (e.g., LayerNorm) to prevent gradient instability in deep architectures. However, these mechanisms inherently cause…

23
arXiv — Machine Learning research 16d ago

When Generator Replay Degrades: Projected Rehearsal Orchestration for Heterogeneous Federated Class-Incremental Learning

arXiv:2606.15695v1 Announce Type: new Abstract: Federated class-incremental learning (FCIL) becomes substantially harder when clients observe different label subsets, progress through tasks at different stages, and provide uneven supervision for the same semantic concepts.…

26
arXiv — NLP / Computation & Language research 16d ago

PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions

arXiv:2606.14832v1 Announce Type: new Abstract: Phone agents are increasingly expected to complete real mobile workflows rather than merely predict the next screen action. However, much of the current mobile-agent literature still evaluates agents primarily as GUI controllers…

36
arXiv — NLP / Computation & Language research 16d ago

ReportQA: QA-Based Radiology Report Evaluation

arXiv:2606.15037v1 Announce Type: new Abstract: Radiology report evaluation is essential for advancing automated report generation. Natural language generation metrics have limited clinical relevance. Clinical efficacy (CE) metrics evaluate important medical findings, but focus…

38
arXiv — NLP / Computation & Language research 16d ago

EHRNote-ChatQA: A Benchmark for Evidence-Grounded Multi-Turn Clinical Question Answering over Longitudinal Discharge Summaries

arXiv:2606.15735v1 Announce Type: new Abstract: Discharge summaries are crucial clinical documents containing the context of a patient's overall hospital stay, and are routinely reviewed by medical experts for patient readmission, ongoing care, and diagnostic decision-making.…

26
arXiv — NLP / Computation & Language research 16d ago

Interactor: Agentic RL oriented Iterative Creation for Ad Description Generation in Sponsored Search

arXiv:2606.15911v1 Announce Type: new Abstract: This paper focuses on automatically generating informative ad descriptions in sponsored search. Unlike ad titles, which are usually optimized to attract user click feedback, ad descriptions have a longer text span and possess the…

8
Vercel — AI dev-tools 16d ago

Workflow SDK now supports inflight cancellation

The Workflow SDK 5 beta now supports the standard AbortController and AbortSignal APIs across workflow and step boundaries. Create a controller inside a workflow, pass its signal into one or more steps, and cancel in-flight operations using the same API fetch already uses. That…

24
Vercel — AI dev-tools 16d ago

Workflow SDK now supports TanStack Start

Workflow SDK now supports TanStack Start applications on Vercel. TanStack Start is built on Vite and Nitro, so the existing workflow/vite plugin works directly. Add it to vite.config.ts alongside tanstackStart(). From there, write workflow and step functions in standard…

27
Hacker News — AI on Front Page community 16d ago

Ten years of ClickHouse in open source

Article URL: https://clickhouse.com/blog/open-source-10 Comments URL: https://news.ycombinator.com/item?id=48546890 Points: 225 # Comments: 65

9
GitHub Blog — AI & ML official-blog 16d ago

GitHub Copilot CLI for Beginners: Overview of common slash commands

GitHub Copilot CLI for Beginners: Learn how to use slash commands to control your terminal AI agent.

26
r/LocalLLaMA community 16d ago

Maybe dumb question, but how do you serve multiple users with the full context length?

After experimenting with llama.cpp, I'm wondering about one thing. Let's say we have an LLM with a context size of 128k. Now let's say we want to have up to 8 parallel users, and we want to provide each client with the full context capabilities. With llama.cpp, how does that work? AFAIK it…

20
Anthropic SDK (Python) releases dev-tools 16d ago

v0.109.2

0.109.2 (2026-06-15). Full Changelog: v0.109.1...v0.109.2. Chores: api: remove retired models from API and SDKs (d4bcfcc)

8
Hacker News — AI on Front Page community 17d ago

Apple Foundation Models

Article URL: https://platform.claude.com/docs/en/cli-sdks-libraries/libraries/apple-foundation-models Comments URL: https://news.ycombinator.com/item?id=48536776 Points: 305 # Comments: 133

29
arXiv — Machine Learning research 17d ago

FedSPC: Shared Parameter Correction for Personalized Federated Learning

arXiv:2606.13748v1 Announce Type: new Abstract: Personalized federated learning (PFL) is one of the important approaches in federated learning for addressing statistical heterogeneity while enabling client-specific adaptation. Many PFL methods split the model into shared and…

28
arXiv — Machine Learning research 17d ago

Attention-Based Estimation of the Individual Treatment Benefit Probability under Dose Variation

arXiv:2606.13821v1 Announce Type: new Abstract: Estimating the probability that a treatment outperforms a control for an individual patient, called the Individual Probability of Treatment Benefit (IPTB), offers a clinically intuitive alternative to population-average metrics.…

36
arXiv — Machine Learning research 17d ago

Can Machine Learning Forecast Rice Yields in Data-Constrained Settings? Satellite Climate Data, National Crop Statistics, and Lessons from Sierra Leone

arXiv:2606.13959v1 Announce Type: new Abstract: Sierra Leone's agriculture operates with almost no data-driven decision support, and no published machine learning study has examined the country's crop yields. We ask whether rice yield can be forecast from data Sierra Leone…

25
arXiv — Machine Learning research 17d ago

Trust but Verify: Mitigating Medical Hallucinations via Post-Hoc Adversarial Auditing and Multi-Agent Feedback Loops

arXiv:2606.14149v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly deployed in healthcare settings, yet their tendency to hallucinate poses risks when clinical decisions are involved. This study examines whether LLMs recommend recently banned or…

25
arXiv — Machine Learning research 17d ago

Learning Urban Access Costs from Origin-Destination Flows via Inverse Optimal Transport

arXiv:2606.14157v1 Announce Type: new Abstract: Cities deliver basic services through mixed public-private facility networks, including schools, clinics, transit providers, and subsidized service points. In these systems, planners often observe where households go, but not the…

9
arXiv — Machine Learning research 17d ago

Machine Learning for Biomedical Raman Spectroscopy: From Spectral Acquisition to Clinical Translation

arXiv:2606.14169v1 Announce Type: new Abstract: Raman spectroscopy provides label-free, chemically specific characterization of biological systems and has become an important tool for cancer diagnosis, molecular subtyping, microbiological identification, and intraoperative…

13
arXiv — Machine Learning research 17d ago

Federated Learning for Feature Generalization with Convex Constraints

arXiv:2606.14416v1 Announce Type: new Abstract: Federated learning (FL) often struggles with generalization due to heterogeneous client data. Local models are prone to overfitting their local data distributions, and even transferable features can be distorted during aggregation.…

12
arXiv — Machine Learning research 17d ago

PepALD: Macrocyclic Peptide Generation via Autoregressive Latent Diffusion

arXiv:2606.14510v1 Announce Type: new Abstract: Macrocyclic peptides are promising therapeutic candidates for intracellular targets, but their design requires simultaneous control over non-natural monomer chemistry, ring topology, membrane permeability, and target binding.…

10
arXiv — Machine Learning research 17d ago

Expert-Driven Survival Machines: Improving Stratification and Interpretability in Multiple Clinical Cohorts

arXiv:2606.14608v1 Announce Type: new Abstract: Survival prediction plays a central role for healthcare providers and clinical researchers. Accurate risk stratification enables early intervention and improved patient management. Most existing deep survival models learn one…

9
arXiv — NLP / Computation & Language research 17d ago

DLawBench: Evaluating LLMs Through Multi-Turn Legal Consultation

arXiv:2606.13931v1 Announce Type: new Abstract: Lawyer-client consultation is a critical starting point for legal services. Effective legal assistance hinges on eliciting sufficient and truthful information from clients in order to devise strategies that best protect their…

5
arXiv — NLP / Computation & Language research 17d ago

Can Post-Training Turn LLMs into Good Medical Coders? An Empirical Study of Generative ICD Coding

arXiv:2606.13940v1 Announce Type: new Abstract: Automated International Classification of Diseases (ICD) coding is a core medical-coding task for billing, epidemiology, and clinical decision support. Generative large language models (LLMs) are often reported as weak medical…

27
arXiv — NLP / Computation & Language research 17d ago

Personal Care Utility: Health as Everyday Infrastructure

arXiv:2606.14145v1 Announce Type: new Abstract: Healthcare is essential, expert, and episodic by design - built around the roughly one hour per year a person spends with a clinician. The 8,759 hours outside clinical settings, where eating, sleeping, movement, medication, and…

9
arXiv — NLP / Computation & Language research 17d ago

A Computational Audit of Demographic Association Encoding in ClinicalBERT Language Predictions

arXiv:2606.14460v1 Announce Type: new Abstract: Transformer-based clinical language models are increasingly integrated into high-stakes clinical decision support pipelines, yet the computational mechanisms through which demographic associations encoded in medical documentation…

35
arXiv — NLP / Computation & Language research 17d ago

Spatio-Temporal Audio Language Modeling for Dynamic Sound Sources

arXiv:2606.14141v1 Announce Type: cross Abstract: Sound events are entities with semantic identities, locations, and trajectories, but current audio-language models usually reason about clips as global event content. Conversely, sound event localization models track source…

12
arXiv — NLP / Computation & Language research 17d ago

ClinHallu: A Benchmark for Diagnosing Stage-Wise Hallucinations in Medical MLLM Reasoning

arXiv:2606.14697v1 Announce Type: cross Abstract: Building trustworthy medical multimodal large language models (MLLMs) is critical for reliable clinical decision support. Existing medical hallucination benchmarks mainly focus on data collection, but often ignore where…

4
Vercel — AI dev-tools 17d ago

Auth0 joins the Vercel Marketplace

You can now add Auth0, production-ready authentication, to your Vercel app in just a few clicks. Built for modern frameworks like Next.js, Auth0 is an identity and access management platform for securing your apps and agentic workflows. This integration enables: Automatic…

26
Vercel — AI dev-tools 17d ago

Chat SDK now supports rich text in Telegram

Chat SDK now renders explicit markdown and ast messages as native rich messages on the Telegram adapter. Your bots get real headings, lists, tables, task lists, formulas, and separate media blocks instead of flattened text. What you get: Native formatting: headings, lists,…

5
r/LocalLLaMA community 17d ago

Gemma 12b less than 10 watts 6.5pp 1.3tg

Google pixel 10 pro Termux Llamacpp version: 9639 (ef8268fee) $ ./llama.cpp/build_vulkan/bin/llama-cli -m storage/downloads/gemma-4-12b-it-UD-Q3_K_XL.gguf --model-draft storage/downloads/mtp-gemma-4-12b-it.gguf --temp 1.0 --top-p 0.95 --top-k 64 --spec-type draft-mtp…

5
Hacker News — AI on Front Page community 17d ago

I indexed 669 GB of my GoPro videos using my M1 Max computer and local ML models

TLDR: I had 2,207 GoPro videos, and I needed to rewatch them to find interesting moments from my cycling journey. I built a project to index them locally on my M1 Max using open-source ML models, search for those moments, and send the best clips straight to my DaVinci Resolve…

28
llama.cpp releases dev-tools 18d ago

b9631

cli: fix not copying preserved tokens (#24258). macOS/iOS: macOS Apple Silicon (arm64), macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED, macOS Intel (x64), iOS XCFramework. Linux: Ubuntu x64 (CPU), Ubuntu arm64 (CPU), Ubuntu s390x (CPU), Ubuntu x64 (Vulkan), Ubuntu arm64…

6
r/LocalLLaMA community 18d ago

Best batteries-included harness tuned for Qwen 3.6 and Gemma 4? (little-coder, smallcode, etc...)

After testing little-coder for a week now, I can confidently say that it's better and more reliable than OpenCode and Cline. What's the best harness you've used with Qwen 3.6 and Gemma 4? I'm aware that you can get better results by using pi.dev or a custom harness tuned for…

6
r/LocalLLaMA community 18d ago

In your opinion, what is the best CLI-based (or other) coding tool for regular software engineering (NOT VIBE CODING)?

This includes but is not limited to: OpenCode, Command Code, Kilo Code, Cline, Claude Code, etc. Please try to include tools to which I can connect local models, so not stuff like Antigravity.

36
r/LocalLLaMA community 19d ago

llama-launcher v1.3 release -> Bayesian Optimisation

Hello everyone, some of you may have seen a post of mine from a few days ago about my app, llama-launcher, a lightweight point-and-click GUI to create llama-server commands without the constant need for typing them up. Well, I've just added an optimisation feature that uses…

16
r/MachineLearning community 19d ago

Price is not cost: how we are using the wrong variable to measure the cost of LLMs [D]

Upfront disclosure: this is my write-up (and I'll link it below), but laying out the argument here so you can strawman/steelman it without clicking anything. Assertion 1: per token price is the wrong metric for measuring the cost of work done by LLMs/reasoning models. Users get…

36
Vercel — AI dev-tools 19d ago

Workflow SDK now runs natively in Nitro v3

Workflow SDK's native Nitro v3 integration is now in beta. Steps run inside the same bundled runtime as the rest of your app, instead of a separate bundle. Nitro's useStorage() and other server-side APIs work directly inside "use step" functions. The Nitro dev server also…

26
GitHub Blog — AI & ML official-blog 19d ago

How we made GitHub Copilot CLI more selective about delegation

Better orchestration, fewer handoffs, faster progress, without a single new knob.

25
The Information — AI news-outlet 19d ago

SpaceX Shares Open at $150 Per Share

SpaceX shares started trading at $150 per share around mid-day on Friday, up 11% from the company’s initial public offering price of $135. Shares of the company climbed shortly after trading began, reaching about $165 shortly after noon. The offering makes SpaceX CEO Elon Musk…

7
arXiv — NLP / Computation & Language research 20d ago

EDEN: A Large-Scale Corpus of Clinical Notes for Italian

arXiv:2606.12569v1 Announce Type: new Abstract: We present EDEN (Emergency Department Electronic Notes), a new and unique large-scale corpus of clinical notes produced in Emergency Departments of Italian hospitals. The corpus, in its current version, is composed of approximately…

25
arXiv — NLP / Computation & Language research 20d ago

sebis at CRF Filling 2026: A Two-Stage Local LLM Pipeline for Medical CRF Filling

arXiv:2606.13082v1 Announce Type: new Abstract: The extraction of structured clinical information from unstructured EHR notes is a persistent bottleneck in healthcare informatics. While large language models (LLMs) offer high performance, their deployment in clinical settings is…

12
