Developer Tool
500 articles archived under #developer-tool
Ars Technica — AI news-outlet 15d ago Anthropic "pauses" token-based billing for its Claude Agent SDK Move originally planned for Monday would have heavily increased power users' costs. 21
r/LocalLLaMA community 15d ago GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench and beats every other open model available From Source: GLM-5.2 is the first open-weights model to cross 80% on Terminal-Bench, and beats every other open model available. It also beats Gemini, making it a frontier-level model for a fraction of the cost. Open weights is back. This model is a game changer. Source: Cline… 14
NVIDIA Developer Blog official-blog 15d ago Build On-Device AI Companions with the NVIDIA ACE Game Agent SDK and Unreal Engine 5 Plugins NVIDIA RTX technologies are deeply integrated into Unreal Engine 5 through the NVIDIA RTX Branch of Unreal Engine and the NVIDIA DLSS Unreal Engine plugin. This... 23
MIT Technology Review — AI news-outlet 16d ago Want to get a data center online quickly? Give it some flex. At the end of a tense and scoreless first half of a soccer match between the English men’s team and rival Germany, millions of Brits let out a collective sigh and did what they so often do in moments of stress: They made tea.
That wave of electric kettles clicking on, however,… 26
Hugging Face Daily Papers research 16d ago PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions Abstract PhoneHarness presents a mixed-action benchmark and execution framework for evaluating phone-use agents on verifiable mobile workflows, demonstrating superior performance over existing approaches through deterministic action routing and auditable execution traces.… 13
arXiv — Machine Learning research 16d ago Semantic Reasoning in Medicine: The Role of Knowledge Graphs Across Five Key Domains arXiv:2606.15155v1 Announce Type: new Abstract: Knowledge graphs (KGs) have emerged as a promising solution for integrating and reasoning over complex biomedical and clinical data in healthcare. By representing structured relationships among entities such as diseases, drugs,… 17
arXiv — Machine Learning research 16d ago RECTOR: Masked Region-Channel-Temporal Modeling for Affective and Cognitive Representation Learning arXiv:2606.15278v1 Announce Type: new Abstract: Affective and cognitive disorders manifest as distributed, time-varying brain network dynamics across regions, channels, and time, challenging robust representation learning from EEG/sEEG for clinical diagnosis.
We propose RECTOR… 34
arXiv — Machine Learning research 16d ago Beyond Classification: A Cough Regression Benchmark for Respiratory Acoustic Foundation Models arXiv:2606.15436v1 Announce Type: new Abstract: Respiratory acoustic foundation models (FMs) excel at cough classification, yet their ability to predict continuous health quantities from cough audio remains largely unexplored, despite the clinical value of passive age, BMI, and… 28
arXiv — Machine Learning research 16d ago Z-Plane Neural Networks: Bounded Geometric Activation Replaces ReLU and LayerNorm arXiv:2606.15669v1 Announce Type: new Abstract: Modern deep neural networks rely on Euclidean scalar activations (e.g., ReLU) and global normalization techniques (e.g., LayerNorm) to prevent gradient instability in deep architectures. However, these mechanisms inherently cause… 23
arXiv — Machine Learning research 16d ago When Generator Replay Degrades: Projected Rehearsal Orchestration for Heterogeneous Federated Class-Incremental Learning arXiv:2606.15695v1 Announce Type: new Abstract: Federated class-incremental learning (FCIL) becomes substantially harder when clients observe different label subsets, progress through tasks at different stages, and provide uneven supervision for the same semantic concepts.… 26
arXiv — NLP / Computation & Language research 16d ago PhoneHarness: Harnessing Phone-Use Agents through Mixed GUI, CLI, and Tool Actions arXiv:2606.14832v1 Announce Type: new Abstract: Phone agents are increasingly expected to complete real mobile workflows rather than merely predict the next screen action. However, much of the current mobile-agent literature still evaluates agents primarily as GUI controllers… 36
arXiv — NLP / Computation & Language research 16d ago ReportQA: QA-Based Radiology Report Evaluation arXiv:2606.15037v1 Announce Type: new Abstract: Radiology report evaluation is essential for advancing automated report generation.
Natural language generation metrics have limited clinical relevance. Clinical efficacy (CE) metrics evaluate important medical findings, but focus… 38
arXiv — NLP / Computation & Language research 16d ago EHRNote-ChatQA: A Benchmark for Evidence-Grounded Multi-Turn Clinical Question Answering over Longitudinal Discharge Summaries arXiv:2606.15735v1 Announce Type: new Abstract: Discharge summaries are crucial clinical documents containing the context of a patient's overall hospital stay, and are routinely reviewed by medical experts for patient readmission, ongoing care, and diagnostic decision-making.… 26
arXiv — NLP / Computation & Language research 16d ago Interactor: Agentic RL oriented Iterative Creation for Ad Description Generation in Sponsored Search arXiv:2606.15911v1 Announce Type: new Abstract: This paper focuses on automatically generating informative ad descriptions in sponsored search. Unlike ad titles which are usually optimized to attract user click feedbacks, ad descriptions have a longer text span and possess the… 8
Vercel — AI dev-tools 16d ago Workflow SDK now supports inflight cancellation The Workflow SDK 5 beta now supports the standard AbortController and AbortSignal APIs across workflow and step boundaries. Create a controller inside a workflow, pass its signal into one or more steps, and cancel in-flight operations using the same API fetch already uses. That… 24
Vercel — AI dev-tools 16d ago Workflow SDK now supports TanStack Start Workflow SDK now supports TanStack Start applications on Vercel. TanStack Start is built on Vite and Nitro, so the existing workflow/vite plugin works directly. Add it to vite.config.ts alongside tanstackStart().
From there, write workflow and step functions in standard… 27
Hacker News — AI on Front Page community 16d ago Ten years of ClickHouse in open source Article URL: https://clickhouse.com/blog/open-source-10 Comments URL: https://news.ycombinator.com/item?id=48546890 Points: 225 # Comments: 65 9
GitHub Blog — AI & ML official-blog 16d ago GitHub Copilot CLI for Beginners: Overview of common slash commands GitHub Copilot CLI for Beginners: Learn how to use slash commands to control your terminal AI agent. The post GitHub Copilot CLI for Beginners: Overview of common slash commands appeared first on The GitHub Blog. 26
r/LocalLLaMA community 16d ago Maybe dumb question, but how do you serve multiple users with the full context length? After experimenting with llama.cpp, I'm wondering a thing. Let's say we have an LLM with a context size of 128k. Now let's say we want to have up to 8 parallel users, and we want to provide each client with the full context capabilities. With llama.cpp, how does that work? AFAIK it… 20
Anthropic SDK (Python) releases dev-tools 16d ago v0.109.2 0.109.2 (2026-06-15) Full Changelog: v0.109.1...v0.109.2 Chores api: remove retired models from API and SDKs (d4bcfcc) 8
Hacker News — AI on Front Page community 17d ago Apple Foundation Models Article URL: https://platform.claude.com/docs/en/cli-sdks-libraries/libraries/apple-foundation-models Comments URL: https://news.ycombinator.com/item?id=48536776 Points: 305 # Comments: 133 29
arXiv — Machine Learning research 17d ago FedSPC: Shared Parameter Correction for Personalized Federated Learning arXiv:2606.13748v1 Announce Type: new Abstract: Personalized federated learning (PFL) is one of the important approaches in federated learning for addressing statistical heterogeneity while enabling client-specific adaptation.
Many PFL methods split the model into shared and… 28
arXiv — Machine Learning research 17d ago Attention-Based Estimation of the Individual Treatment Benefit Probability under Dose Variation arXiv:2606.13821v1 Announce Type: new Abstract: Estimating the probability that a treatment outperforms a control for an individual patient, called the Individual Probability of Treatment Benefit (IPTB), offers a clinically intuitive alternative to population-average metrics.… 36
arXiv — Machine Learning research 17d ago Can Machine Learning Forecast Rice Yields in Data-Constrained Settings? Satellite Climate Data, National Crop Statistics, and Lessons from Sierra Leone arXiv:2606.13959v1 Announce Type: new Abstract: Sierra Leone's agriculture operates with almost no data-driven decision support, and no published machine learning study has examined the country's crop yields. We ask whether rice yield can be forecast from data Sierra Leone… 25
arXiv — Machine Learning research 17d ago Trust but Verify: Mitigating Medical Hallucinations via Post-Hoc Adversarial Auditing and Multi-Agent Feedback Loops arXiv:2606.14149v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly deployed in healthcare settings, yet their tendency to hallucinate poses risks when clinical decisions are involved. This study examines whether LLMs recommend recently banned or… 25
arXiv — Machine Learning research 17d ago Learning Urban Access Costs from Origin-Destination Flows via Inverse Optimal Transport arXiv:2606.14157v1 Announce Type: new Abstract: Cities deliver basic services through mixed public-private facility networks, including schools, clinics, transit providers, and subsidized service points.
In these systems, planners often observe where households go, but not the… 9
arXiv — Machine Learning research 17d ago Machine Learning for Biomedical Raman Spectroscopy: From Spectral Acquisition to Clinical Translation arXiv:2606.14169v1 Announce Type: new Abstract: Raman spectroscopy provides label-free, chemically specific characterization of biological systems and has become an important tool for cancer diagnosis, molecular subtyping, microbiological identification, and intraoperative… 13
arXiv — Machine Learning research 17d ago Federated Learning for Feature Generalization with Convex Constraints arXiv:2606.14416v1 Announce Type: new Abstract: Federated learning (FL) often struggles with generalization due to heterogeneous client data. Local models are prone to overfitting their local data distributions, and even transferable features can be distorted during aggregation.… 12
arXiv — Machine Learning research 17d ago PepALD: Macrocyclic Peptide Generation via Autoregressive Latent Diffusion arXiv:2606.14510v1 Announce Type: new Abstract: Macrocyclic peptides are promising therapeutic candidates for intracellular targets, but their design requires simultaneous control over non-natural monomer chemistry, ring topology, membrane permeability, and target binding.… 10
arXiv — Machine Learning research 17d ago Expert-Driven Survival Machines: Improving Stratification and Interpretability in Multiple Clinical Cohorts arXiv:2606.14608v1 Announce Type: new Abstract: Survival prediction plays a central role for healthcare providers and clinical researchers. Accurate risk stratification enables early intervention and improved patient management. Most existing deep survival models learn one… 9
arXiv — NLP / Computation & Language research 17d ago DLawBench: Evaluating LLMs Through Multi-Turn Legal Consultation arXiv:2606.13931v1 Announce Type: new Abstract: Lawyer-client consultation is a critical starting point for legal services.
Effective legal assistance hinges on eliciting sufficient and truthful information from clients in order to devise strategies that best protect their… 5
arXiv — NLP / Computation & Language research 17d ago Can Post-Training Turn LLMs into Good Medical Coders? An Empirical Study of Generative ICD Coding arXiv:2606.13940v1 Announce Type: new Abstract: Automated International Classification of Diseases (ICD) coding is a core medical-coding task for billing, epidemiology, and clinical decision support. Generative large language models (LLMs) are often reported as weak medical… 27
arXiv — NLP / Computation & Language research 17d ago Personal Care Utility: Health as Everyday Infrastructure arXiv:2606.14145v1 Announce Type: new Abstract: Healthcare is essential, expert, and episodic by design - built around the roughly one hour per year a person spends with a clinician. The 8,759 hours outside clinical settings, where eating, sleeping, movement, medication, and… 9
arXiv — NLP / Computation & Language research 17d ago A Computational Audit of Demographic Association Encoding in ClinicalBERT Language Predictions arXiv:2606.14460v1 Announce Type: new Abstract: Transformer-based clinical language models are increasingly integrated into high-stakes clinical decision support pipelines, yet the computational mechanisms through which demographic associations encoded in medical documentation… 35
arXiv — NLP / Computation & Language research 17d ago Spatio-Temporal Audio Language Modeling for Dynamic Sound Sources arXiv:2606.14141v1 Announce Type: cross Abstract: Sound events are entities with semantic identities, locations, and trajectories, but current audio-language models usually reason about clips as global event content.
Conversely, sound event localization models track source… 12
arXiv — NLP / Computation & Language research 17d ago ClinHallu: A Benchmark for Diagnosing Stage-Wise Hallucinations in Medical MLLM Reasoning arXiv:2606.14697v1 Announce Type: cross Abstract: Building trustworthy medical multimodal large language models (MLLMs) is critical for reliable clinical decision support. Existing medical hallucination benchmarks mainly focus on data collection, but often ignore where… 4
Vercel — AI dev-tools 17d ago Auth0 joins the Vercel Marketplace You can now add Auth0, a production-ready authentication platform, to your Vercel app in just a few clicks. Built for modern frameworks like Next.js, Auth0 is an identity and access management platform for securing your apps and agentic workflows. This integration enables: Automatic… 26
Vercel — AI dev-tools 17d ago Chat SDK now supports rich text in Telegram Chat SDK now renders explicit markdown and ast messages as native rich messages on the Telegram adapter. Your bots get real headings, lists, tables, task lists, formulas, and separate media blocks instead of flattened text. What you get: Native formatting: headings, lists,… 5
r/LocalLLaMA community 17d ago Gemma 12b less than 10 watts 6.5pp 1.3tg Google pixel 10 pro Termux Llamacpp version: 9639 (ef8268fee) $ ./llama.cpp/build_vulkan/bin/llama-cli -m storage/downloads/gemma-4-12b-it-UD-Q3_K_XL.gguf --model-draft storage/downloads/mtp-gemma-4-12b-it.gguf --temp 1.0 --top-p 0.95 --top-k 64 --spec-type draft-mtp… 5
Hacker News — AI on Front Page community 17d ago I indexed 669 GB of my GoPro videos using my M1 Max computer and local ML models TLDR: I had 2,207 GoPro videos, and I need to rewatch them to find interesting moments from my cycling journey.
I built a project to index them locally on my M1 Max using open-source ML models, search for those moments, and send the best clips straight to my DaVinci Resolve… 28
llama.cpp releases dev-tools 18d ago b9631 cli : fix not copying preserved tokens (#24258) macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64, KleidiAI enabled) DISABLED macOS Intel (x64) iOS XCFramework Linux: Ubuntu x64 (CPU) Ubuntu arm64 (CPU) Ubuntu s390x (CPU) Ubuntu x64 (Vulkan) Ubuntu arm64… 6
r/LocalLLaMA community 18d ago Best batteries-included harness tuned for Qwen 3.6 and Gemma 4? (little-coder, smallcode, etc...) After testing little-coder for a week now, I can confidently say that it's better and more reliable than OpenCode and Cline. What's the best harness you've used with Qwen 3.6 and Gemma 4? I'm aware that you can get better results by using pi.dev or a custom harness tuned for… 6
r/LocalLLaMA community 18d ago In your opinion, what is the best CLI-based (or other) coding tool for regular software engineering (NOT VIBE CODING)? This includes but is not only limited to: OpenCode, Command Code, Kilo Code, Cline, Claude Code, etc. Please try to include tools in which I can connect local models, so not stuff like Antigravity. submitted by /u/Potential_Top_4669 36
r/LocalLLaMA community 19d ago llama-launcher v1.3 release -> Bayesian Optimisation Hello everyone, some of you may have seen a post of mine from a few days ago about my app, llama-launcher, a lightweight point-and-click GUI to create llama-server commands without the constant need for typing them up. Well, I've just added an optimisation feature that uses… 16
r/MachineLearning community 19d ago Price is not cost: how we are using the wrong variable to measure the cost of LLMs [D] Upfront disclosure: this is my write-up (and I'll link it below), but laying out the argument here so you can strawman/steelman it without clicking anything.
Assertion 1: per token price is the wrong metric for measuring the cost of work done by LLMs/reasoning models. Users get… 36
Vercel — AI dev-tools 19d ago Workflow SDK now runs natively in Nitro v3 Workflow SDK's native Nitro v3 integration is now in beta. Steps run inside the same bundled runtime as the rest of your app, instead of a separate bundle. Nitro's useStorage() and other server-side APIs work directly inside "use step" functions. The Nitro dev server also… 26
GitHub Blog — AI & ML official-blog 19d ago How we made GitHub Copilot CLI more selective about delegation Better orchestration, fewer handoffs, faster progress, without a single new knob. The post How we made GitHub Copilot CLI more selective about delegation appeared first on The GitHub Blog. 25
The Information — AI news-outlet 19d ago SpaceX Shares Open at $150 Per Share SpaceX shares started trading at $150 per share around mid-day on Friday, up 11% from the company’s initial public offering price of $135. Shares of the company climbed shortly after trading began, reaching about $165 shortly after noon. The offering makes SpaceX CEO Elon Musk… 7
arXiv — NLP / Computation & Language research 20d ago EDEN: A Large-Scale Corpus of Clinical Notes for Italian arXiv:2606.12569v1 Announce Type: new Abstract: We present EDEN (Emergency Department Electronic Notes), a new and unique large-scale corpus of clinical notes produced in Emergency Departments of Italian hospitals. The corpus, in its current version, is composed of approximately… 25
arXiv — NLP / Computation & Language research 20d ago sebis at CRF Filling 2026: A Two-Stage Local LLM Pipeline for Medical CRF Filling arXiv:2606.13082v1 Announce Type: new Abstract: The extraction of structured clinical information from unstructured EHR notes is a persistent bottleneck in healthcare informatics. While large language models (LLMs) offer high performance, their deployment in clinical settings is… 12