Tag

Research papers

500 articles archived under #paper

arXiv — Machine Learning research 2d ago

Towards Evaluating Data Priors for Tabular Foundation Models

arXiv:2606.29241v1 Announce Type: new Abstract: Data-generating priors are a central component of tabular foundation models because they define the task distribution used during pretraining. However, priors are rarely evaluated as independent components, making it difficult to…

12
arXiv — Machine Learning research 2d ago

KrishokChat: A Citation-Grounded Dataset and Benchmark for Bengali Agricultural Advisory

arXiv:2606.29243v1 Announce Type: new Abstract: We present KrishokChat, the first citation-grounded Bengali agricultural instruction-tuning dataset for crop advisory in low-resource settings. We establish a foundation of 290 hierarchical Knowledge Nodes, extracting disease…

30
arXiv — Machine Learning research 2d ago

When Prices Double in a Week: Forecasting of Agricultural Volatility in Import-Isolated Markets

arXiv:2606.29248v1 Announce Type: new Abstract: Vegetable prices in Sri Lanka are highly volatile because the market is largely import-isolated, so supply disruptions quickly drive prices up. This study develops a machine learning framework to forecast such volatility by…

8
arXiv — Machine Learning research 2d ago

Learning to Bid in Discriminatory Auctions with Budget Constraints

arXiv:2606.29252v1 Announce Type: new Abstract: We study repeated bidding in multi-unit discriminatory (pay-as-bid) auctions for a single bidder with per-round utility equal to value minus $\alpha$ times payment, where $\alpha\in[0,1]$ is a cost-of-capital parameter. The bidder…

7
arXiv — Machine Learning research 2d ago

Nonlinear mixture model motivated subspace clustering

arXiv:2606.29261v1 Announce Type: new Abstract: We derive the linear union-of-subspaces (UoS) model for subspace clustering (SC) from the nonlinear mixture model (NMM) used in blind source separation (BSS) to represent a D-dimensional observation vector as an unknown…

7
arXiv — Machine Learning research 2d ago

PCGD: Physics-Guided Conditional Graph Diffusion for TCAD Device Simulation

arXiv:2606.29272v1 Announce Type: new Abstract: Technology computer-aided design (TCAD) semiconductor device simulation is fundamentally constrained by the high computational cost of iteratively solving coupled drift-diffusion equations. Existing ML surrogates either reduce…

33
arXiv — Machine Learning research 2d ago

Adaptive Block Diffusion: Resolving Training-Inference Mismatch in Diffusion Language Models

arXiv:2606.29275v1 Announce Type: new Abstract: Diffusion Language Models (DLMs) are typically trained under fixed context structures, restricting denoising to predetermined token subsets. This creates a mismatch between training and inference, where models must operate over…

37
arXiv — Machine Learning research 2d ago

Deterministic Decisions for High-Stakes AI. A Zero-Egress Pipeline with the Deployability of RAG and the Accuracy of Machine Learning

arXiv:2606.29280v1 Announce Type: new Abstract: We identify intervention bias as a previously unquantified failure mode of zero-shot large-language-model (LLM) educational advisory agents: without task-specific training, they recommend action when a hindsight-optimal oracle…

31
arXiv — Machine Learning research 2d ago

Beyond Trajectory Matching: Reflow with Marginal Distribution Alignment

arXiv:2606.29287v1 Announce Type: new Abstract: Diffusion and continuous-flow generative models achieve high-quality generation, and their deterministic sampling can be formulated as solving learned ODE dynamics. However, accurate ODE discretization often requires many steps,…

36
arXiv — Machine Learning research 2d ago

SP-CACW: Convergence-Aware Client Weighting for Selfish Personalized Learning

arXiv:2606.29322v1 Announce Type: new Abstract: Collaborative learning is sustainable only when it benefits each participant. Standard federated learning optimizes a global average objective, which can underperform for clients whose data distributions differ substantially from…

35
arXiv — Machine Learning research 2d ago

Deciphering Region-Level Signatures from Latency Measurements in LEO Satellite Internet

arXiv:2606.29324v1 Announce Type: new Abstract: Low-Earth orbit (LEO) satellite Internet has become an indispensable infrastructure that provides growing coverage for global users. Despite extensive measurement efforts, the principles underlying region-level performance…

32
arXiv — Machine Learning research 2d ago

Sample Complexity of Scientific Discovery: PAC Learnability of Compositional Function Trees

arXiv:2606.29331v1 Announce Type: new Abstract: Scientific discovery via symbolic regression is often viewed as statistically and computationally intractable because the hypothesis space of expressions grows combinatorially with depth. This paper revisits the statistical side…

33
arXiv — Machine Learning research 2d ago

AMR: Adaptive Modality Routing for Multimodal Polyglot Speaker Identification

arXiv:2606.29335v1 Announce Type: new Abstract: Multimodal speaker identification systems face two key challenges in real-world deployment: missing modalities and language mismatch between training and testing conditions. In practical scenarios, background multi-speaker…

14
arXiv — Machine Learning research 2d ago

Reliability, Faithfulness, and the Limits of Post-hoc Explanations of Opaque Scientific Models

arXiv:2606.29346v1 Announce Type: new Abstract: Post-hoc explanation methods are routinely used to interpret scientific machine learning models, with the deliverable understood to be insight into the phenomenon the model has been trained on. The transition may be taken to be…

22
arXiv — Machine Learning research 2d ago

Adaptive Financial Transformer with Regime-Gated Attention for Stock Return Prediction

arXiv:2606.29347v1 Announce Type: new Abstract: Adaptive Financial Transformer (AFT) is proposed for stock return prediction under non-stationary financial markets. The model incorporates a Market Regime Encoder, an Adaptive Gate Network, and an Adaptive Financial Context module…

17
arXiv — Machine Learning research 2d ago

Interventional Flow Matching: Prospective Dose-Response Forecasting with Velocity-Field Jacobian Regularization

arXiv:2606.29386v1 Announce Type: new Abstract: Predicting a patient's physiological trajectory under a planned treatment sequence is a prospective interventional problem, not standard time-series extrapolation. We study this problem in glucose management, where insulin and…

20
arXiv — Machine Learning research 2d ago

Temporal Posed and Spontaneous Gesture Recognition from Electromyography in the Rock-Paper-Scissors Game

arXiv:2606.29423v1 Announce Type: new Abstract: The importance of gesture recognition has been acknowledged in many domains requiring real-time recognition systems. Two requirements for these are fast recognition in multiuser contexts. Therefore, we explored the temporal…

4
arXiv — Machine Learning research 2d ago

Randomized neural operator for parametric PDEs with fast training and conformal uncertainty quantification

arXiv:2606.29440v1 Announce Type: new Abstract: Repeatedly solving parametric PDEs is essential for uncertainty quantification, design optimization and inverse problems, but conventional neural operators require expensive non-convex training. We introduce PCA–RaNN, a randomized…

16
arXiv — Machine Learning research 2d ago

Interpretable Inverse Design of Metal-Organic Frameworks with Large Language Model Agents

arXiv:2606.29459v1 Announce Type: new Abstract: Inverse design of metal-organic frameworks (MOFs) requires searching a combinatorially vast space where property labels are expensive and most machine-learning models reveal little about why a structure succeeds. We introduce…

8
arXiv — Machine Learning research 2d ago

Prototype Latent World Model Replay for Class-Incremental Learning

arXiv:2606.29465v1 Announce Type: new Abstract: Class-incremental learning requires a model to learn new classes while preserving decision regions for old ones. This is difficult when raw old samples are no longer available. We propose Prototype Latent World Model Replay, a…

8
arXiv — Machine Learning research 2d ago

Self-Supervised Calibration of Scientific Instruments Using Physical Consistency Constraints

arXiv:2606.29466v1 Announce Type: new Abstract: Calibration remains one of the principal obstacles to the deployment of machine learning in scientific instrumentation because it typically relies on expert intervention, dedicated procedures, and manually labelled data. We…

13
arXiv — Machine Learning research 2d ago

Structured Proper Loss Geometries for Multiclass Classification: Theory and Controlled Empirical Evaluation

arXiv:2606.29471v1 Announce Type: new Abstract: Strictly proper scoring rules identify the true conditional class distribution at population level, but their curvature can alter optimization and finite-sample behavior. We study three multiclass objectives: a class-aware…

23
arXiv — Machine Learning research 2d ago

CRAFT: Counterfactual Credit Assignment from Free Sibling Rollouts for Self-Distilled Agentic Reinforcement Learning

arXiv:2606.29476v1 Announce Type: new Abstract: Self-distilled agentic reinforcement learning augments trajectory-level reward with a token-level distillation loss, using as its teacher the same policy conditioned on privileged context. The prevailing recipe gates this loss by a…

24
arXiv — Machine Learning research 2d ago

Reported Confidence in LLMs Tracks Commitment More Than Correctness

arXiv:2606.29490v1 Announce Type: new Abstract: Confidence is an estimate of the probability that a chosen answer is correct. Verbal confidence reports are widely used as uncertainty measures in large language models, but whether they are best understood as estimates of…

33
arXiv — Machine Learning research 2d ago

Reinforcement Learning in Super Mario Bros: Curriculum, Pedagogy, and Optimal Level Design in World 1-1

arXiv:2606.29511v1 Announce Type: new Abstract: World 1-1 of Super Mario Bros is widely celebrated as a masterclass in game design: its progressive structure is credited with teaching players core mechanics through the level itself. We ask whether that structure is empirically…

22
arXiv — Machine Learning research 2d ago

A Mathematical Optimization Approach for Expert-Informed Bayesian Best Subset Selection

arXiv:2606.29516v1 Announce Type: new Abstract: A central challenge in statistical modeling is identifying the subset of features that belong in the true regression model. The classical best subset selection problem, recently made tractable via mixed-integer optimization (MIO),…

34
arXiv — Machine Learning research 2d ago

Anti-Collapse Dynamics and the Emergence of Multi-Time-Scale Learning in Recurrent Neural Networks

arXiv:2606.29519v1 Announce Type: new Abstract: Long-range learning is hard for recurrent networks trained with stochastic gradient descent, because the influence of a past input fades with the lag $\ell$, and if it fades too fast the dependence cannot be learned from finite…

28
arXiv — Machine Learning research 2d ago

Not All Objectives Are Born Equal: Priority-Constrained Descent for Hierarchical Multi-Objective Optimization

arXiv:2606.29521v1 Announce Type: new Abstract: Deep learning problems rarely involve objectives that are equal in importance. A primary objective defines the goal, whilst secondary objectives, such as sparsity, compression, or robustness, constrain the solution. While existing…

31
arXiv — Machine Learning research 2d ago

Do Models Read What They Write? Causal Registers in Scratchpad Reasoning

arXiv:2606.29522v1 Announce Type: new Abstract: A central hope behind process supervision is that models can expose intermediate variables that matter for their later behavior. For this to help with alignment, a scratchpad must be tied to the computation: when the model writes a…

29
arXiv — Machine Learning research 2d ago

The Mirage of Optimizing Training Policies: Monotonic Inference Policies as the Real Objective for LLM Reinforcement Learning

arXiv:2606.29526v1 Announce Type: new Abstract: Reinforcement learning (RL) has gained growing attention in large language model (LLM) post-training, yet RL training remains fragile and can suffer from instability or collapse. One vital cause is training-inference mismatch: LLM…

17
arXiv — Machine Learning research 2d ago

VISTA-DZ: Visual Semantic Trajectory Adaptation for Personalized Dilemma Zone Prediction

arXiv:2606.29548v1 Announce Type: new Abstract: Driver decision making in the dilemma zone at signalized intersections is safety critical, as vehicles approaching a yellow signal must decide whether to stop or proceed within limited time and distance margins. Accurate prediction…

38
arXiv — Machine Learning research 2d ago

Optimizer Memory Makes Shuffle Order a First-Order Source of Fine-Tuning Noise

arXiv:2606.29554v1 Announce Type: new Abstract: Shuffle order can be a larger source of fine-tuning noise than a memoryless analysis predicts: fixed-clock optimizer memory makes local equal-multiset contrasts first order in the learning rate rather than second order, and the…

8
arXiv — NLP / Computation & Language research 2d ago

Generating in the Limit with Infinitely Many Hallucinations

arXiv:2606.28354v1 Announce Type: new Abstract: The classic paradigm of language identification in the limit models learning as a game between an adversary, who reveals strings from an unknown target language, and a learner tasked with identifying that language. The recently…

10
arXiv — NLP / Computation & Language research 2d ago

Extracting Knowledge from an Arabic-English Machine-Readable Dictionary Using Information Extraction

arXiv:2606.28457v1 Announce Type: new Abstract: Natural language processing (NLP) applications need a large and rich body of linguistic knowledge. Furthermore, electronic language sources such as dictionaries, encyclopedias, and corpora have become available. So, automatic methods are…

28
arXiv — NLP / Computation & Language research 2d ago

Developmental Trajectories of Situation Modeling and Mentalizing in Transformer Language Models

arXiv:2606.28524v1 Announce Type: new Abstract: Recent work suggests that Large Language Models (LLMs) are sensitive to the belief states of agents described by text, as measured by the false belief task (FBT), yet persistent concerns of construct validity remain. We adopt a…

25
arXiv — NLP / Computation & Language research 2d ago

A French OSCE Dialogue Dataset and Controllable Virtual Patient System for Clinical Training

arXiv:2606.28526v1 Announce Type: new Abstract: The clinical and communication skills of medical students are commonly assessed through Objective Structured Clinical Examinations (OSCEs), which consist of brief scenario-driven simulations of doctor-patient interactions. However,…

36
arXiv — NLP / Computation & Language research 2d ago

Legal Domain Adaptation of Modern BERT Models

arXiv:2606.28538v1 Announce Type: new Abstract: We investigate domain adaptation of modern BERT models in the legal domain. We further pre-train ModernBERT on all US court opinions using the masked language modeling objective. Although ModernBERT has been trained on roughly 500x…

26
arXiv — NLP / Computation & Language research 2d ago

Turn-Averaged SAEs for Feature Discovery and Long-Context Attribution

arXiv:2606.28548v1 Announce Type: new Abstract: Sparse autoencoders (SAEs) have become a useful tool for extracting interpretable features in language models. However, standard SAE architectures operate on individual token activations, meaning that the number of active features…

25
arXiv — NLP / Computation & Language research 2d ago

Depth-Staggered Fibonacci Spacing for Sparse Attention: Static Schedules Beat Learned Dilation and Extrapolate Where Dense Attention Fails

arXiv:2606.28560v1 Announce Type: new Abstract: We study sparse self-attention in which each query attends to a dense local window plus a set of Fibonacci-spaced offsets, with a per-layer scalar alpha that compresses or expands the spacing. Across 21 language models trained…

20
arXiv — NLP / Computation & Language research 2d ago

SEAD: Competence-Aware On-Policy Distillation via Entropy-Guided Supervision

arXiv:2606.28562v1 Announce Type: new Abstract: On-policy distillation (OPD) has a property absent in offline distillation and RL: teacher supervision quality depends on student competence. Incoherent rollouts yield noisy gradients; already-mastered tokens yield redundant ones.…

10
arXiv — NLP / Computation & Language research 2d ago

Correct codes for the wrong reasons? validating LLMs as measurement instruments for theoretical constructs

arXiv:2606.28574v1 Announce Type: new Abstract: When a large language model (LLM) codes a construct in text as a human annotator would, that agreement makes the LLM a reliable coder. Yet reliability leaves construct validity untouched. The instrument may be theory-naive,…

35
arXiv — NLP / Computation & Language research 2d ago

Phonological Perception of Sign Language Models

arXiv:2606.28667v1 Announce Type: new Abstract: Sign languages are compositional systems where meaning arises by combining sublexical phonological parameters, such as handshape, location, and movement. While deep learning models for Sign Language Recognition (SLR) have achieved…

38
arXiv — NLP / Computation & Language research 2d ago

AnTenA: Actionable and Explainable Tensor Analysis System with Large Language Models

arXiv:2606.28708v1 Announce Type: new Abstract: Accurately explaining hidden patterns in multi-aspect data has typically been done by leveraging labels and/or accompanying auxiliary metadata. However, labels and auxiliary data may be inaccurate (e.g. nonstandard, inconsistent),…

21
arXiv — NLP / Computation & Language research 2d ago

SEATauBench: Adapting Tool-Agent-User Evaluation Into Low-Resource Southeast Asian Languages

arXiv:2606.28715v1 Announce Type: new Abstract: While AI development and evaluation for Southeast Asia (SEA) has grown rapidly, agent capabilities in regional languages are still poorly understood despite their importance to sovereign AI. To fill this gap, we introduce…

28
arXiv — NLP / Computation & Language research 2d ago

DriftGuard: Safety-Aware Multi-Monitor Detection and Selective Adaptation for Evolving Toxicity Moderation

arXiv:2606.28725v1 Announce Type: new Abstract: Automated toxicity moderation systems operate in dynamic online environments where harmful behavior evolves through coded language, shifting targets, and strategic adaptation to enforcement. Existing drift detection methods often…

12
arXiv — NLP / Computation & Language research 2d ago

5ting at SemEval-2026 Task 8: Strong End-to-End Multi-Turn RAG via LLM-Based Reranking and Faithfulness Control

arXiv:2606.28737v1 Announce Type: new Abstract: We introduce 5ting, our system for the SemEval-2026 Task 8 (MTRAGEval), which evaluates multi-turn Retrieval Augmented Generation (RAG) systems. Multi-turn RAG involves context drift, underspecification, and hallucination risk. Our…

5
arXiv — NLP / Computation & Language research 2d ago

Majority Vote Silences Minority Values: Annotator Disagreement at the Hate/Offensive Boundary in HateXplain

arXiv:2606.28772v1 Announce Type: new Abstract: Hate speech annotation pipelines routinely collapse annotator disagreement into majority vote labels before training. We show that this aggregation is not neutral: 42.6% of all annotator disagreement in HateXplain concentrates…

28
arXiv — NLP / Computation & Language research 2d ago

Structure-Preserving Document Translation via Multi-Stage LLM Pipeline: A Case Study in Marathi

arXiv:2606.28796v1 Announce Type: new Abstract: Government documents in India are predominantly issued in regional languages such as Marathi, creating substantial accessibility barriers for non-native readers, interstate administrative bodies, and policy analysts. Although…

30
arXiv — NLP / Computation & Language research 2d ago

Labeling Training Data for Entity Matching Using Large Language Models

arXiv:2606.28823v1 Announce Type: new Abstract: Recent large language models (LLMs) achieve strong performance on entity matching without requiring task-specific training data. However, applying these models to large sets of candidate pairs remains slow and costly. In contrast,…

9
arXiv — NLP / Computation & Language research 2d ago

The Heterogeneous Safety Impacts of Benign Multilingual Fine-Tuning

arXiv:2606.28843v1 Announce Type: new Abstract: Fine-tuning a large language model is a ubiquitous method for enhancing its capability on a specific downstream task. However, prior work has shown that this increase in capability comes with a cost: it can increase a model's…

18
