Developer Tool: 500 articles archived under #developer-tool
arXiv — Machine Learning research 2h ago A Filtered Mixture-of-Generators for Fully Synthetic Survival Training arXiv:2607.00127v1 Announce Type: new Abstract: Survival analysis models time-to-event data, but in clinical settings training data are costly and scarce: events accrue over years of follow-up, cohorts are small, and privacy regulations restrict sharing across institutions.… 26 arXiv — Machine Learning research 2h ago Entropy-Regularized Probabilistic Gates for Sparse Model Discovery in Scarce-Data Federated Learning arXiv:2607.00275v1 Announce Type: new Abstract: Federated Learning (FL) is a distributed machine learning (ML) paradigm with collaboration among multiple clients without sharing data. FL is challenging under data heterogeneity and partial client participation. Learning sparse… 14 arXiv — Machine Learning research 2h ago LLM-Guided ODE Discovery and Parameter Inference from Small-Cohort Aggregate Data arXiv:2607.00733v1 Announce Type: new Abstract: Mechanistic modeling via ordinary differential equations (ODEs) provides interpretable descriptions of complex dynamics and enables inference of underlying mechanisms, which is particularly valuable in clinical settings. However,… 36 arXiv — Machine Learning research 2h ago Automatic Detection of Stress from Speech in the Trier Social Stress Test arXiv:2607.00986v1 Announce Type: new Abstract: Automatically detecting stress in speech provides an unobtrusive way to gain insights relevant to behavioral research or clinical assessment.
This study investigates the automatic differentiation between a stressful and… 12 arXiv — NLP / Computation & Language research 2h ago Selective Test-Time Debiasing for CLIP via Reward Gating arXiv:2607.00423v1 Announce Type: new Abstract: Vision language models (VLMs) demonstrate strong zero-shot performance, but often perpetuate social stereotypes in person-centric queries, yielding skewed demographic distributions. Current debiasing methods apply uniform bias… 22 arXiv — NLP / Computation & Language research 2h ago Dynamic Bidirectional Pattern Memory: A Production-Scale Empirical Characterisation of Inference-Time Gating in Clinical NLP arXiv:2607.00870v1 Announce Type: new Abstract: We study inference-time pattern-memory gating in a production-scale clinical natural language processing (NLP) pipeline. The pipeline pairs a generator (Llama-3.3 70B) proposing extractions with a verifier (MMed-Llama-3.1 70B)… 33 arXiv — NLP / Computation & Language research 2h ago Clinician-Level Agreement Without Clinical Caution: LLM Evaluator Limits in Medical AI Benchmarking arXiv:2607.01103v1 Announce Type: new Abstract: Open-response evaluation provides stronger clinical validity than multiple-choice benchmarks but creates a scoring bottleneck that motivates automated LLM-as-a-Judge approaches. Whether such evaluators replicate clinical calibration… 12 Anthropic SDK (Python) releases dev-tools 8h ago v0.115.1 0.115.1 (2026-07-01) Full Changelog: v0.115.0...v0.115.1 Chores api: remove some nonfunctional types from the SDKs ( 5e7c431 ) 27 r/LocalLLaMA community 9h ago End of an Agony. Real production service that uses LLM to earn money my team had made and now we are so happy that it will die. Here are some of my final "experiences". Hello everyone. I had posted in this sub about making a production service about 8 months ago. Here the link of my previous post . The idea was the same. We wanted to make a real production service that we can provide to clients to earn money.
AI assistant that works through… 30 llama.cpp releases dev-tools 11h ago b9859 opencl: allow loading precompiled binary kernels from library ( #23042 ) opencl: allow loading binary kernel opencl: add libdl.h ggml-backend-dl is in ggml, which depends backend libs, thus ggml-opencl cannot depend on ggml-backend-dl add libdl.h to break cyclic dep opencl:… 6 arXiv — Machine Learning research 1d ago Accelerometry-Derived Digital Biomarkers for Cardiometabolic Risk: A Population-Representative Tabular Benchmark with Uncertainty Quantification arXiv:2606.30702v1 Announce Type: new Abstract: Structured tabular data dominates clinical medicine, yet existing benchmarks fail to reflect real-world properties like complex survey sampling, demographic oversampling, and subgroup fairness. We introduce the NHANES Accelerometry… 31 arXiv — Machine Learning research 1d ago Mind the Residual Gap: Probabilistic Downscaling under Real-World Bias arXiv:2606.30821v1 Announce Type: new Abstract: Probabilistic downscaling is the task of modeling the conditional distribution of high-resolution fields given coarse inputs, and is a central challenge to atmospheric science, climate modeling, and other multiscale physical… 21 arXiv — Machine Learning research 1d ago Teaching LLMs to Recommend and Defer in Underrepresented Epilepsy Care arXiv:2606.31036v1 Announce Type: new Abstract: Specialist epilepsy expertise is scarce in resource-constrained settings, making LLM-based decision support attractive for frontline clinicians managing longitudinal treatment. Such systems must adapt to local prescribing practice… 12 arXiv — Machine Learning research 1d ago TRIAGE: Role-Typed Credit Assignment for Agentic Reinforcement Learning arXiv:2606.32017v1 Announce Type: new Abstract: Agentic reinforcement learning requires assigning credit to environment-facing actions such as searches, clicks, edits, navigation commands, and object interactions. 
Standard GRPO uses the final verifier outcome as a uniform… 13 arXiv — Machine Learning research 1d ago Explainable Artificial Intelligence For The Detection and Characterisation of Stage B Heart Failure arXiv:2606.30665v1 Announce Type: cross Abstract: Stage B heart failure is characterized by asymptomatic structural or functional cardiac abnormalities. Identifying individuals at this stage is clinically important, as early detection may enable targeted interventions to prevent… 20 arXiv — NLP / Computation & Language research 1d ago Gated Multi-Graph Fusion via Graph Attention Networks for Alzheimer's Disease Detection arXiv:2606.31186v1 Announce Type: new Abstract: Spontaneous speech is a vital non-invasive biomarker for Alzheimer's Disease (AD), yet many systems overlook non-linear structural disruptions and clinical heterogeneity in pathological language. We propose a Multi-View Gated Graph… 31 arXiv — NLP / Computation & Language research 1d ago Clinically Structured Rank-Gated LoRA for Cross-Benchmark Medical Question Answering arXiv:2606.31432v1 Announce Type: new Abstract: Medical multiple-choice question answering requires parameter-efficient adaptation across heterogeneous knowledge domains and reasoning operations. A medication question, a diagnostic decision, a public-health item, and a… 33 arXiv — NLP / Computation & Language research 1d ago CLExEval: A Human-in-the-Loop Framework for Qualitative Evaluation of LLM Clinical Reasoning arXiv:2606.31608v1 Announce Type: new Abstract: Large Language Models (LLMs) achieve strong results on many medical benchmarks, but their clinical reasoning remains difficult to evaluate reliably. A central risk is an evaluation illusion: fluent and well-structured explanations… 37 Vercel — AI dev-tools 1d ago Enforce consistent code for agents and humans with konsistent konsistent is now open source. 
konsistent is a CLI linter for TypeScript codebases that enforces structural conventions, giving agents and humans the consistent context they need to implement features correctly. Deterministic, fast, and covers structural patterns that TypeScript… 6 Vercel — AI dev-tools 1d ago Dry-run deployments with Vercel CLI You can now preview the framework preset and files that Vercel CLI includes in a deployment before creating one. Run vercel deploy --dry from a linked project: For automation or further inspection, return the complete file manifest as JSON: JSON output includes the detected… 17 Hacker News — AI on Front Page community 1d ago The labor share of income in the US is at its lowest post-war level Article URL: https://libertystreeteconomics.newyorkfed.org/2026/06/the-post-covid-decline-in-the-labor-share/ Comments URL: https://news.ycombinator.com/item?id=48734234 Points: 290 # Comments: 234 31 arXiv — Machine Learning research 2d ago NIVA: A Multimodal Foundation Model for Actionable Earth System Intelligence arXiv:2606.28546v1 Announce Type: new Abstract: Recent advances in AI-driven weather and climate modeling have improved forecast skill while reducing computational cost. However, existing data-driven approaches are limited in their ability to model coupled Earth system dynamics,… 9 arXiv — Machine Learning research 2d ago GLACIER: Rethinking Mass Spectrum Prediction as an Object Detection Problem arXiv:2606.29161v1 Announce Type: new Abstract: Predicting tandem mass spectra (MS/MS) from molecular structures represents a central task in analytical chemistry with direct relevance to clinical metabolomics, systems biology, and adjacent disciplines. In this work, we revisit… 13 arXiv — Machine Learning research 2d ago SP-CACW: Convergence-Aware Client Weighting for Selfish Personalized Learning arXiv:2606.29322v1 Announce Type: new Abstract: Collaborative learning is sustainable only when it benefits each participant. 
Standard federated learning optimizes a global average objective, which can underperform for clients whose data distributions differ substantially from… 35 arXiv — NLP / Computation & Language research 2d ago A French OSCE Dialogue Dataset and Controllable Virtual Patient System for Clinical Training arXiv:2606.28526v1 Announce Type: new Abstract: The clinical and communication skills of medical students are commonly assessed through Objective Structured Clinical Examinations (OSCEs), which consist of brief scenario-driven simulations of doctor-patient interactions. However,… 36 arXiv — NLP / Computation & Language research 2d ago The strength of clinical evidence is recoverable from language model representations but not from their stated grades arXiv:2606.29034v1 Announce Type: new Abstract: Large language models (LLMs) increasingly summarize clinical evidence, where a claim's weight depends on how strongly it is supported. Yet these models convey confidence poorly, and properties they never state, such as truth, are… 17 arXiv — NLP / Computation & Language research 2d ago TriageRA-CCF: Source-Side Clinical Confidence and Coverage Signals for Adaptive Rank Budgeting in Medical LLMs arXiv:2606.29375v1 Announce Type: new Abstract: Medical large language models are commonly adapted with a fixed low-rank budget, even though medical questions differ substantially in confidence, clinical coverage, and cross-domain difficulty. We study adaptive rank budgeting for… 15 arXiv — NLP / Computation & Language research 2d ago How much of an LLM-generated clinical corpus is actually new?
A production-scale measurement of content redundancy for provenance classification arXiv:2606.29605v1 Announce Type: new Abstract: Clinical machine learning increasingly relies on training corpora generated by large language models (LLMs) rather than annotated by clinicians, and such corpora are described and reused largely on the basis of their reported… 12 arXiv — NLP / Computation & Language research 2d ago Clinical Reasoning Graphs: Structured Evaluation of LLM Diagnostic Reasoning Reveals Competence Without Consistency arXiv:2606.29876v1 Announce Type: new Abstract: Modern large language models (LLMs) reach 60-70% diagnostic accuracy on complex clinical case benchmarks, but accuracy alone cannot distinguish stable clinically-grounded reasoning from pattern matching. We introduce clinical… 10 r/LocalLLaMA community 2d ago Krea-2-Turbo Image Model - Easy to be fully uncensored, but it can also EDIT Images! I've been super impressed with Krea-2-Turbo. It can generate high quality images in ~3 seconds. The quality is quite good compared to other local AI image gen models. Now, I don't want to make you watch or click a you tube video, so I'll just give these clear instructions on how… 5 r/LocalLLaMA community 2d ago Anyone else end up building a web access layer for local AI agents? I've been running local models for most of my experiments, and I kept running into the same issue. The model lives locally, but everything it needs to interact with doesn't. Every new agent ended up with another GitHub client, another Reddit integration, another documentation… 10 Vercel — AI dev-tools 2d ago Query Speed Insights from the Vercel CLI You can now query Speed Insights datapoints directly through the Vercel CLI. Using the vercel metrics command, you can pull core Web Vitals (LCP, INP, CLS) and other page performance metrics (FCP, TTFB) based on client-side measurements from real user traffic. 
By providing a… 9 arXiv — Machine Learning research 3d ago FoggyTrust: Robust Federated Learning with Hierarchical Trust Networks arXiv:2606.27622v1 Announce Type: new Abstract: Byzantine-robust federated learning seeks to protect distributed model training from malicious or corrupted clients without requiring access to their private data. FLTrust addresses this challenge by introducing a trusted… 33 arXiv — Machine Learning research 3d ago OperatorSHAP: Fast and Accurate Shapley Value Estimation for Neural Operators arXiv:2606.28065v1 Announce Type: new Abstract: Understanding model predictions is essential for physical applications, where outputs often inform safety-critical decisions, such as structural load assessment, weather warnings, and clinical diagnosis. Shapley values satisfy many… 20 arXiv — Machine Learning research 3d ago CPAgents: Agentic Composite Phenotype Generation for Cardiac Disease Association arXiv:2606.28179v1 Announce Type: new Abstract: Identifying robust associations between cardiac imaging phenotypes and clinical diseases is fundamental to population-scale cardiovascular research and reliable risk stratification. However, current phenome-wide association studies… 13 arXiv — NLP / Computation & Language research 3d ago From Black-Box to Clinical Insight: A Multi-Stage Explainable Framework for Speech-Based Cognitive Impairment Detection arXiv:2606.27973v1 Announce Type: new Abstract: Speech-based cognitive impairment detection offers a noninvasive, accessible alternative to costly biomarker assays, yet transformer-based models remain clinically uninterpretable. 
We propose a multi-stage explainability framework… 23 arXiv — NLP / Computation & Language research 3d ago The Signal-Coverage Matrix: Stratifying Type and Semantic Errors in Statement Autoformalization arXiv:2606.28013v1 Announce Type: new Abstract: Headline type-correctness (TC\%) of LLM autoformalization has climbed from $\sim$53\% to $\sim$76\% in two years, yet this scalar conceals which errors each method resolves. We propose a signal-coverage matrix that crosses the Lean… 23 arXiv — NLP / Computation & Language research 3d ago Aloe-Vision: Robust Vision-Language Models for Healthcare arXiv:2606.27500v1 Announce Type: cross Abstract: Large Vision-Language Models (LVLMs) specialized in healthcare are emerging as a promising research direction due to their potential impact in clinical and biomedical applications. However, progress is constrained by the scarcity… 28 Vercel — AI dev-tools 3d ago xAI Grok audio models now available on Vercel AI Gateway xAI's audio models are now live on AI Gateway. Realtime voice, text to speech, and speech to text are all available through the AI SDK with the same routing, observability, and spend controls as your other models. These capabilities are available on the AI SDK 7 release.… 11 r/MachineLearning community 4d ago I silently break training codes or configs so I made pybench [P] It is like pytest but for statistical tests: it ensures no regression of your metrics at a statistical level. It manages tedious things such as seeds, past benchmark results, ... Simple CLI working like pytest but with benchmarks/ directory instead of tests/: pybench # 1st… 38 r/LocalLLaMA community 5d ago Hello there! (again) i ported my kokoro enhancements so you can use them in your projects. i made a web based and python based version of the enhancements i made to kokoro's controls. both are, of course, fully client side. if you have hardware acceleration turned on in your browser, kokoro runs on webgpu at about 40ms per generation.
it's really fast. note: the… 36 Vercel — AI dev-tools 5d ago Query Web Analytics from the Vercel CLI You can now query Web Analytics datapoints directly through the Vercel CLI. Using the vercel metrics command, you can pull page views, visitors, and custom events for your Vercel projects to analyze traffic, compare trends, and answer questions about site performance. By… 14 r/MachineLearning community 5d ago Made a free tool that automatically cuts the best clips from long videos — thought this community might find it useful [P] I edit a lot of long-form content and got tired of scrubbing through hour-long recordings to find the good moments. So I built something to do it. You give it a video file (or a YouTube link), it figures out which parts are actually worth watching, and exports short clips in… 22 arXiv — Machine Learning research 6d ago Beyond Feedforward Networks: Reentry Neural Systems as the Fundamental Basis of Subjecthood and Intrinsic Safety of Next-Generation AGI arXiv:2606.26406v1 Announce Type: new Abstract: We propose a complete architectural blueprint for safe artificial general intelligence based on a closed reentry loop (D I cycle). In contrast to feedforward networks, which are directed acyclic graphs (C=0, S=0) incapable of… 37 arXiv — NLP / Computation & Language research 6d ago Context Recycling for Long-Horizon LLM Inference arXiv:2606.26105v1 Announce Type: new Abstract: Large language models (LLMs) exhibit strong capabilities in short-context reasoning but degrade in performance over long conversational horizons due to context window limitations and inefficient token usage. 
We introduce… 27 arXiv — Machine Learning research 6d ago Dot-Flik: A Scalable Edge AI Architecture for Distributed Insect Monitoring arXiv:2606.26121v1 Announce Type: cross Abstract: Global insect population declines necessitate scalable, continuous monitoring systems, yet existing vision-based solutions remain constrained by high hardware costs, energy demands, and reliance on centralized processing or cloud… 11 arXiv — NLP / Computation & Language research 6d ago Comparing BERT Sentence-Pair Classification and Few-Shot LLM Prompting for Detecting Threat and Solution Framing in German Climate News arXiv:2606.26489v1 Announce Type: new Abstract: News media play a central role in shaping public perceptions of climate change, and whether coverage emphasizes threats or solutions has measurable effects on audience engagement and policy support. Automated detection of these… 23 arXiv — NLP / Computation & Language research 6d ago SamaVaani: Auditing and Debiasing Multilingual Clinical ASR for Indian Languages arXiv:2606.26901v1 Announce Type: new Abstract: Automatic Speech Recognition (ASR) is increasingly used to document clinical encounters, yet its reliability in multilingual and demographically diverse Indian healthcare context remains largely unknown. 
In this study, we first… 6 arXiv — NLP / Computation & Language research 6d ago From Clicks to Intent: Cross-Platform Session Embeddings with LLM-Distilled Taxonomy for Financial Services Recommendations arXiv:2606.26277v1 Announce Type: cross Abstract: Sequential user behavior modeling is widely adopted in industrial recommender systems; however, significant gaps remain in financial services, where pre-login web interactions and authenticated in-app experiences differ… 24 arXiv — NLP / Computation & Language research 6d ago Somatic in the East, Psychological in the West?: Investigating Clinically-Grounded Cross-Cultural Depression Symptom Expression in LLMs arXiv:2508.03247v2 Announce Type: replace Abstract: Prior clinical psychology research shows that Western individuals with depression tend to report psychological symptoms, while Eastern individuals report somatic ones. We test whether Large Language Models (LLMs), which are… 5