Research papers

500 articles archived under #paper

arXiv — Machine Learning research 1d ago

A Transferable Learned Temporal Prior for Transmission Reconstruction and Decision-Relevant Uncertainty in Real Outbreak Labels

arXiv:2606.30842v1 Announce Type: new Abstract: Outbreak transmission reconstruction treats epidemiological timing and transmission labels as deterministic ground truth; neither has been systematically evaluated. We trained a logistic regression temporal prior on eleven disease…

22
arXiv — Machine Learning research 1d ago

Behavior Cloning is Not All You Need: The Optimality of On-Policy Distillation for Noisy Expert Feedback

arXiv:2606.30923v1 Announce Type: new Abstract: Imitation Learning is a natural framework for learning in sequential decision-making systems and has emerged as the dominant paradigm through which we understand language model training. A central puzzle is that, while in theory…

10
arXiv — Machine Learning research 1d ago

Personalizing Marketplace Policies with Competing Objectives and Constrained Experiments: Evidence from a Job Marketplace

arXiv:2606.30932v1 Announce Type: new Abstract: Two-sided marketplaces connect distinct user groups whose interests often conflict -- improving outcomes on one side could degrade the other side's experience. To address this challenge, we deploy an integrated framework for…

31
arXiv — Machine Learning research 1d ago

Quality-Aware Modulation for Diffusion Transformers

arXiv:2606.30934v1 Announce Type: new Abstract: Modern text-to-image diffusion models, such as diffusion transformers (DiT), rely on timestep or prompt embeddings to modulate the strength of the denoising process in each timestep. While this modulation communicates the current…

31
arXiv — Machine Learning research 1d ago

Physics-informed Conditional Normalizing Flows for Angles-only Cislunar Orbit Determination

arXiv:2606.30936v1 Announce Type: new Abstract: Generative Astrodynamics is advanced in this work by extending generative modelling to an orbit determination problem in the cislunar environment. The task is formulated as conditional density estimation, aiming to infer the…

38
arXiv — Machine Learning research 1d ago

Multistage Defer Trees for Hybrid Interpretability: If at First You Can't Succeed, Tree Again

arXiv:2606.30995v1 Announce Type: new Abstract: Recent work has shown that well-optimized individual decision trees can match complex black box models in some settings, primarily in noisy domains. For the remaining settings, however, complex ensembled compositions of trees often…

26
arXiv — Machine Learning research 1d ago

Estimating Supply Incrementality in Two-sided Marketplaces: A Causal Machine Learning Approach

arXiv:2606.30999v1 Announce Type: new Abstract: In two-sided marketplaces with heterogeneous products, it is important to understand the causal relationship between additional supply and marketplace outcomes, such as the total quantity transacted or transaction value in the…

26
arXiv — Machine Learning research 1d ago

Offline Reinforcement Learning for Fluid Controls: Data-based Multi-observational Policy Extraction

arXiv:2606.31025v1 Announce Type: new Abstract: Active flow control is a fundamental application in engineering. Recent advances in deep reinforcement learning have made progress in this field. However, the classical online RL approaches require extensive real-time interactions…

19
arXiv — Machine Learning research 1d ago

OTCache: Optimal Transport for Geometry-Aware Caching in Diffusion Models

arXiv:2606.31026v1 Announce Type: new Abstract: We propose OTCache, a training-free framework for accelerating diffusion sampling via caching schedule prediction. Existing graph-based caching methods reduce redundant computation by optimizing shortest-path objectives, but rely…

34
arXiv — Machine Learning research 1d ago

Teaching LLMs to Recommend and Defer in Underrepresented Epilepsy Care

arXiv:2606.31036v1 Announce Type: new Abstract: Specialist epilepsy expertise is scarce in resource-constrained settings, making LLM-based decision support attractive for frontline clinicians managing longitudinal treatment. Such systems must adapt to local prescribing practice…

12
arXiv — Machine Learning research 1d ago

Warp RL: Reshaping Base Policy Distributions for Dynamics Adaptation

arXiv:2606.31043v1 Announce Type: new Abstract: Residual reinforcement learning adapts a pretrained robot policy by learning an additive correction to its actions. While effective when adaptation amounts to shifting the base policy's action distribution, additive corrections…

26
arXiv — Machine Learning research 1d ago

Knowledge Distillation from Large Reasoning Models to Compact Student Models: A Case Study on the John O Bryan Mathematics Competition

arXiv:2606.31048v1 Announce Type: new Abstract: This paper investigates knowledge distillation from a large reasoning model (DeepSeek-R1) to a compact student model (Qwen2.5-7B). Using historical problems from the John O'Bryan Mathematics Competition at Northern Kentucky…

7
arXiv — Machine Learning research 1d ago

Fora: From Weight-Space to Function-Space Protection in Capability-Preserving Fine-Tuning

arXiv:2606.31092v1 Announce Type: new Abstract: Full fine-tuning adapts large language models to new tasks but can erode capabilities they already possess. Existing remedies protect through proxies such as parameter distances, importance penalties, output matching, or dominant…

11
arXiv — Machine Learning research 1d ago

Explaining Machine Learning and Memorization with Statistical Mechanics

arXiv:2606.31110v1 Announce Type: new Abstract: Artificial neural networks (NNs) and machine learning (ML) algorithms are poorly understood from a theoretical perspective, which makes it difficult to fully realize their potential and overcome their weaknesses. For instance, ML…

8
arXiv — Machine Learning research 1d ago

Visualizing High-Dimensional Graph Embeddings via Informed Multi-View Projections

arXiv:2606.31119v1 Announce Type: new Abstract: Graphs are commonly visualized in 2D, where humans readily interpret spatial relationships, yet such layouts often distort higher-dimensional structure. We propose to embed graphs in high-dimensional space and search for…

38
arXiv — Machine Learning research 1d ago

Can Tabular In-Context Learners Generalize to Biomolecular Property Prediction?

arXiv:2606.31126v1 Announce Type: new Abstract: Predicting biomolecular properties from limited labeled data is a central bottleneck in protein engineering and small-molecule design. As strong pretrained encoders now supply rich fixed-length representations, the difficulty has…

28
arXiv — Machine Learning research 1d ago

A Bayesian Filtering Approach for Learning Lagrangian Dynamics from Noisy Measurements

arXiv:2606.31137v1 Announce Type: new Abstract: This paper proposes a Bayesian filtering-based approach for learning the dynamics of a physical system from partial, noisy measurements. We model the system dynamics using a Lagrangian mechanics formulation. As in Lagrangian neural…

38
arXiv — Machine Learning research 1d ago

PPT-Eval: A Benchmark for Computer-Use Agents on PowerPoint Tasks

arXiv:2606.31154v1 Announce Type: new Abstract: Creating and editing slides is a rich, multimodal activity that is ubiquitous in professional and educational settings, making it an ideal testbed for real-world computer-use agents. Microsoft PowerPoint is among the most widely…

25
arXiv — NLP / Computation & Language research 1d ago

ComplianceGate: Classifier-Gated Multi-Tier LLM Routing for Inference in Regulated Industries

arXiv:2606.31163v1 Announce Type: cross Abstract: Large language models deployed in regulated industries operate under two constraints: compliance enforcement and cost efficiency. Personally identifiable information (PII) in user queries can reach model endpoints before the…

14
arXiv — Machine Learning research 1d ago

AETDICE: Unified Framework and Offline Optimization for Nonlinear Multi-Objective RL

arXiv:2606.31178v1 Announce Type: new Abstract: Optimizing nonlinear preferences in multi-objective reinforcement learning (MORL) is essential for capturing complex trade-offs like risk aversion or fairness. However, such non-linearity has historically bifurcated nonlinear MORL…

31
arXiv — Machine Learning research 1d ago

Transformers as Bayesian In-Context Experimenters: Smoothness-Adaptive Efficient ATE Estimation

arXiv:2606.31184v1 Announce Type: new Abstract: Adaptive experiments for average treatment effects (ATE) require randomized allocations balancing valid inference with statistical efficiency. The oracle design is a covariate-dependent Neyman rule governed by unknown…

18
arXiv — Machine Learning research 1d ago

ISM: Self-Improving Strategy Memory for Continual Mathematical Reasoning

arXiv:2606.31191v1 Announce Type: new Abstract: We propose Intelligent Schema Memory (ISM), a self-evolving memory-augmented system that improves mathematical reasoning for a frozen LLM under continual learning with hard episodic resets. ISM maintains a compact, self-refined…

34
arXiv — Machine Learning research 1d ago

Probing Memorization of Tabular In-Context Learning

arXiv:2606.31208v1 Announce Type: new Abstract: Large tabular models (LTMs), i.e., tabular foundation models leveraging in-context learning (ICL), achieve state-of-the-art performance on tabular tasks. While LLMs are known to unintentionally memorize training data, the…

19
arXiv — Machine Learning research 1d ago

Learning Gaussian Graphical Models from a Glauber Trajectory Without Mixing

arXiv:2606.31230v1 Announce Type: new Abstract: We study the task of learning the structure of a $d$-sparse Gaussian graphical model on $n$ variables from a single trajectory of Glauber dynamics. Beyond algorithmic considerations, many applications present temporally correlated…

28
arXiv — Machine Learning research 1d ago

TDGT: A Tabular Data Generation Toolkit supporting adaptive GPU-accelerated Bayesian mixture models, diffusion-based models, and latent-space generative modeling

arXiv:2606.31268v1 Announce Type: new Abstract: The growing demand for privacy-preserving data sharing has positioned synthetic data generation as a critical component of responsible AI workflows. Despite notable advances in generative modeling, existing solutions often lack…

29
arXiv — Machine Learning research 1d ago

The Calibration Turn in AI-Assisted Research: A Conceptual and Methodological Framework for Evidence-Licensed Claims

arXiv:2606.31273v1 Announce Type: new Abstract: AI-assisted research has entered a stage in which the central question is not only whether systems can generate hypotheses, run experiments, or produce manuscripts, but whether their scientific claims are calibrated to the evidence…

37
arXiv — Machine Learning research 1d ago

Revisiting the Volume Hypothesis

arXiv:2606.31282v1 Announce Type: new Abstract: Modern deep neural networks often contain far more parameters than needed to fit their training data, yet they achieve impressive generalization. A common explanation for this success is the implicit bias of stochastic gradient…

31
arXiv — Machine Learning research 1d ago

Sequential sparse Gaussian process quantile regression

arXiv:2606.31284v1 Announce Type: new Abstract: Quantile regression aims to estimate the conditional quantiles of a response variable from observed data. In a Bayesian setting, Gaussian process quantile regression provides uncertainty quantification but faces significant…

37
arXiv — Machine Learning research 1d ago

Probabilistic Inversion with Flow Matching

arXiv:2606.31288v1 Announce Type: new Abstract: We demonstrate the application of Flow Matching, a technique originating from generative Artificial Intelligence, to probabilistic inversion in geophysical settings, such as seismic Full-Waveform inversion. We adapt the…

7
arXiv — Machine Learning research 1d ago

Patch-PODiff-ViT: Structured Latent Diffusion with Patchwise POD for Super-Resolution and Uncertainty Quantification

arXiv:2606.31290v1 Announce Type: new Abstract: Diffusion models enable probabilistic super-resolution and conditional generation, but pixel-space methods are computationally expensive and learned latent spaces often lack interpretable uncertainty quantification. We introduce…

7
arXiv — Machine Learning research 1d ago

Deep Reinforcement Learning for Spacecraft Attitude Control During Atmospheric Re-Entry

arXiv:2606.31291v1 Announce Type: new Abstract: Deep reinforcement learning has the potential to solve attitude control problems more adaptively, precisely, and robustly by handling nonlinear dynamics, uncertainties, and failure cases more effectively than traditional attitude…

11
arXiv — Machine Learning research 1d ago

Safe Online Learning via Smooth Safety-Structured Policy Composition

arXiv:2606.31320v1 Announce Type: new Abstract: Safe online reinforcement learning requires policies to respect safety constraints while maintaining smooth optimization dynamics. Existing approaches typically rely on either strict safety enforcement via action interventions,…

7
arXiv — Machine Learning research 1d ago

Expected Gain-based Escalation in Vertical Federated Learning

arXiv:2606.31331v1 Announce Type: new Abstract: Collaborative inference can improve predictive performance by integrating complementary information across agents, but applying collaborative fusion to every sample can incur unnecessary communication and computational overhead.…

17
arXiv — Machine Learning research 1d ago

Dualformer: Efficient Feature Extractor for Complex-valued Blind Communication Signal Analysis

arXiv:2606.31352v1 Announce Type: new Abstract: Designing effective feature extractors is critical for blind signal analysis tasks such as automatic modulation recognition (AMR), signal scheme recognition (SSR), and signal structure parsing (SSP). In this work, we…

10
arXiv — NLP / Computation & Language research 1d ago

Calibrating the Evaluator: Does Probability Calibration Mitigate Preference Coupling in LLM Agent Feedback Loops?

arXiv:2606.31371v1 Announce Type: cross Abstract: When large language model (LLM) agents adapt their behavior through evaluator feedback, systematic evaluator biases propagate into the agent's learned strategy distribution - a phenomenon termed evaluator preference coupling.…

38
arXiv — Machine Learning research 1d ago

Resolving superposition in AI for interpretability and cross-modal alignment in patient-neuronal images

arXiv:2606.31394v1 Announce Type: new Abstract: Artificial intelligence is transforming our capability to solve biological challenges. In dimensionality bottleneck regimes exacerbated by high-dimensional biological data, neural networks force distinct concepts into the lower…

14
arXiv — Machine Learning research 1d ago

Mixture-of-Control: State-Aware Fine-Tuning for Transformer-based Models

arXiv:2606.31397v1 Announce Type: new Abstract: State-based fine-tuning has emerged as a compelling alternative to weight-based adaptation for transformers, updating lightweight controls into states rather than model weights, offering substantial memory savings while retaining…

27
arXiv — Machine Learning research 1d ago

Contextual Slate GLM Bandits with Limited Adaptivity

arXiv:2606.31449v1 Announce Type: new Abstract: We investigate the contextual slate bandit problem with generalized linear rewards under limited adaptivity. At each round, the learner is presented with $N$ sets of items, where each item is represented by a $d$-dimensional…

22
arXiv — Machine Learning research 1d ago

Zero-Shot Quantization for Object Detectors using Off-the-Shelf Generative Models

arXiv:2606.31456v1 Announce Type: new Abstract: With an increasing number of Object Detection (OD) models being deployed on edge devices, Zero-Shot Quantization for OD (ZSQ-OD) aims to quantize these models when access to the original training data is prohibited. Existing…

34
arXiv — Machine Learning research 1d ago

TabPATE: Differentially Private Tabular In-Context Learning Without Public Data

arXiv:2606.31474v1 Announce Type: new Abstract: Tabular foundation models enable accurate in-context learning (ICL) from small labeled datasets, but the private records placed in context can leak through model predictions. We first show that even basic membership inference…

38
arXiv — Machine Learning research 1d ago

Constrained Online Convex Optimization without Slater's Condition

arXiv:2606.31480v1 Announce Type: new Abstract: We study constrained online convex optimization with adversarial losses and stochastic or adversarial constraints. For stochastic constraints, existing algorithms that achieve nearly optimal regret and constraint violation bounds…

27
arXiv — NLP / Computation & Language research 1d ago

Fork-Think with Confidence

arXiv:2606.31484v1 Announce Type: cross Abstract: Parallel thinking has enjoyed great success for boosting LLM performance on reasoning tasks without the need for any re-training. However, existing methods follow a think-first-then-decide paradigm, i.e., they first sample…

38
arXiv — NLP / Computation & Language research 1d ago

RaBitQCache: Rotated Binary Quantization for KVCache in Long Context LLM Inference

arXiv:2606.31519v1 Announce Type: cross Abstract: Long-context Large Language Model inference is severely bottlenecked by the massive Key-Value (KV) cache, yet existing sparse attention methods often suffer from static fixed-budget (Top-k) retrieval or rely on proxy scores that…

5
arXiv — Machine Learning research 1d ago

On the Convergence of Self-Improving Online LLM Alignment

arXiv:2606.31524v1 Announce Type: new Abstract: The Self-Improving Alignment (SAIL) algorithm addresses distribution shift by reducing a bilevel formulation of the problem to an efficient, single-level method. Empirically, SAIL has demonstrated strong performance on this task.…

8
arXiv — Machine Learning research 1d ago

Beyond the Expressivity-Trainability Paradox: A Dynamical Lie Algebra Perspective on Navigating Barren Plateaus in Quantum Machine Learning

arXiv:2606.31536v1 Announce Type: new Abstract: As Quantum Machine Learning (QML) transitions toward practical implementation, the field faces a critical architectural bottleneck that challenges the fundamental assumptions of classical statistical learning theory. In classical…

16
arXiv — Machine Learning research 1d ago

Introduction to Stochastic Differential Equations for Generative Machine Learning: A Variational Perspective

arXiv:2606.31576v1 Announce Type: new Abstract: The use of ordinary and stochastic differential equations has led to substantial progress in generative machine learning with applications to, for example, image, video and biomolecule generation. This paper provides a…

23
arXiv — Machine Learning research 1d ago

Robustness of neural networks to random noise perturbations of their inputs

arXiv:2606.31581v1 Announce Type: new Abstract: We investigate the problem of the robustness of a trained neural network to the perturbation of its input values. More specifically, we examine the interplay between the accuracy of the network, as measured by the mean squared…

11
arXiv — Machine Learning research 1d ago

Evil Spectra: How Optimisers can Amplify or Suppress Emergent Misalignment

arXiv:2606.31591v1 Announce Type: new Abstract: Emergent misalignment (EM) is a recently discovered phenomenon in LLMs where fine-tuning on a narrow misaligned task, such as writing insecure code, leads to broadly misaligned behaviour on unrelated prompts. Previous work has…

11
arXiv — Machine Learning research 1d ago

Calibration, Not Compilation: Detecting and Repairing Misspecified Probabilistic Programs Written by Language Models

arXiv:2606.31630v1 Announce Type: new Abstract: Language models increasingly write probabilistic programs (in NumPyro, Stan, or Pyro), but a program that compiles, runs, and passes every unit test can still be statistically wrong -- a Gaussian likelihood for heavy-tailed…

4
arXiv — Machine Learning research 1d ago

ECHO: Prune to act, trace to learn with selective turn memory in agentic RL

arXiv:2606.31650v1 Announce Type: new Abstract: Long-horizon language agents must repeatedly interact with tools, accumulate evidence, and make decisions under bounded context windows. Existing context-management methods make such rollouts feasible by truncating distant history,…

12
