Research papers

500 articles archived under #paper

arXiv — Machine Learning research 1d ago

Improving Certified Robustness via Adversarial Distillation

arXiv:2606.31653v1 Announce Type: new Abstract: Certified training aims to produce models whose predictions can be formally verified against adversarial perturbations, typically by optimising upper bounds on the worst-case loss over an allowed perturbation set. For neural…

28
arXiv — Machine Learning research 1d ago

When to Truncate a Feature Ranking: A Residual-Overlap Stopping Rule for Subset Selection

arXiv:2606.31686v1 Announce Type: new Abstract: Feature rankings are widely used in supervised feature selection because they are simple, scalable and easy to interpret. Variables are first ranked by a relevance score, and a subset is then obtained by retaining the top-ranked…

17
arXiv — Machine Learning research 1d ago

Diffusing Blame: Task-Dependent Credit Assignment in Biologically Plausible Dual-Stream Networks

arXiv:2606.31700v1 Announce Type: new Abstract: Biological neural circuits obey Dale's principle: each neuron's synapses are uniformly excitatory or inhibitory. Artificial networks that respect this constraint must coordinate separate excitatory and inhibitory populations,…

28
arXiv — Machine Learning research 1d ago

Nonlinearity-Aware LoRA: Structured Gate Adaptation under Low-Rank Constraints

arXiv:2606.31717v1 Announce Type: new Abstract: Low-rank adaptation (LoRA) is commonly viewed as an update-space approximation to full fine-tuning, yet this view is incomplete for self-gated Transformer feed-forward networks. In gated FFNs, a low-rank residual can change not…

13
arXiv — Machine Learning research 1d ago

FedXDS: Leveraging Model Attribution Methods to counteract Data Heterogeneity in Federated Learning

arXiv:2606.31742v1 Announce Type: new Abstract: Explainable AI (XAI) methods have demonstrated significant success in recent years at identifying relevant features in input data that drive deep learning model decisions, enhancing interpretability for users. However, the…

4
arXiv — Machine Learning research 1d ago

Addressing Over-Refusal in LLMs with Competing Rewards

arXiv:2606.31748v1 Announce Type: new Abstract: Safety training on language models often induces over-refusal: improved safety on harmful prompts at the cost of increased refusal on harmless ones. Though this trade-off can be mitigated by training models with reinforcement…

25
arXiv — Machine Learning research 1d ago

Policy Optimization Achieves Data-Dependent Regret Bounds in MDPs with Unknown Transitions

arXiv:2606.31769v1 Announce Type: new Abstract: We study policy optimization for online episodic tabular Markov decision processes with unknown transition kernels, aiming for best-of-both-worlds guarantees together with data-dependent regret bounds. Recent work (Dann et al.,…

8
arXiv — NLP / Computation & Language research 1d ago

Bridging the Gap Between Latent and Explicit Reasoning with Looped Transformers

arXiv:2606.31779v1 Announce Type: cross Abstract: Language models typically reason via explicit chain-of-thought (CoT), generating intermediate steps token-by-token. Latent CoT offers an alternative: it performs multi-step reasoning in the model's hidden states, replacing…

34
arXiv — Machine Learning research 1d ago

Relational and Sequential Conformal Inference for Energy Time Series over Graphs via Foundation Models

arXiv:2606.31804v1 Announce Type: new Abstract: Accurate energy demand forecasting is essential for the reliable operation and planning of modern sustainable energy systems. Spatial-temporal graph neural networks (STGNNs) have recently achieved strong performance in point…

7
arXiv — Machine Learning research 1d ago

Geometry-Preserving Orthonormal Initialization for Low-Rank Adaptation in RLVR

arXiv:2606.31813v1 Announce Type: new Abstract: Low-rank adaptation (LoRA) and its variants enable parameter-efficient fine-tuning of large language models under the supervised fine-tuning (SFT) paradigm. However, their efficacy and behavior under Reinforcement learning with…

24
arXiv — Machine Learning research 1d ago

Low-dimensional topology of deep neural networks

arXiv:2606.31856v1 Announce Type: new Abstract: We study layered models, including feedforward networks, ResNets, and transformers, by limiting each layer to a width of $d = 3$, i.e., $\mathbb{R}^3$ as representation space. This allows us to track how a neural network changes…

17
arXiv — NLP / Computation & Language research 1d ago

Review Residuals: Update-Conditioned Residual Gating for Transformers

arXiv:2606.31859v1 Announce Type: cross Abstract: Residual connections add every sublayer's proposed update with a fixed coefficient of one; the network never evaluates whether an update is reliable before committing it. Drawing on the human-factors principle of independent…

23
arXiv — Machine Learning research 1d ago

Sequential RC-TGAN: Generating Relational Time Series with Spectral Envelope Loss

arXiv:2606.31904v1 Announce Type: new Abstract: The generation of synthetic relational databases often involves modeling complex temporal dynamics, such as transaction logs or event sequences. A significant challenge in this domain is the handling of categorical time series…

26
arXiv — Machine Learning research 1d ago

Interface-Aware Neural Newton Preconditioning for Robust Cohesive Zone Model Simulations

arXiv:2606.31921v1 Announce Type: new Abstract: Cohesive Zone Models (CZMs) are widely used to simulate interface fracture, delamination, adhesive failure, and fiber–matrix debonding in aerospace composite structures. In implicit quasi-static finite element analyses, cohesive…

27
arXiv — Machine Learning research 1d ago

Making Sense of Touch from the Child's View for Contrastive Learning

arXiv:2606.31943v1 Announce Type: new Abstract: Is the sense of touch a mechanism for human babies' learning of visual concepts? If so, can we quantify its importance, and to what extent do babies rely on their sense of touch for visual learning? To approach these questions in a…

35
arXiv — NLP / Computation & Language research 1d ago

Signed-Permutation Coordinate Transport for RMSNorm Transformers

arXiv:2606.31963v1 Announce Type: cross Abstract: Modern LLM workflows move coordinate-indexed objects across checkpoints: steering vectors, sparse autoencoders, top-$k$ neuron sets, attribution lists, and merge alignments. This is only well posed after fixing the model's…

37
arXiv — Machine Learning research 1d ago

Amplifying Membership Signal Through Chained Regeneration

arXiv:2606.31991v1 Announce Type: new Abstract: The tendency of large generative models to memorize training data makes sample verification critical for privacy auditing and copyright enforcement. Current membership (MIA) and dataset inference (DI) attacks often rely on one-shot…

7
arXiv — Machine Learning research 1d ago

Radial Suppression Accelerates Algorithmic Generalization: A Geometric Analysis of Delayed Generalization

arXiv:2606.32000v1 Announce Type: new Abstract: Why do neural networks memorize algorithmic training data long before they generalize? We present a geometric case study demonstrating that, on tasks where generalization requires discovering structured low-dimensional circuits,…

17
arXiv — Machine Learning research 1d ago

Surrogate Fidelity: When Can Open LLMs Explain Closed Ones?

arXiv:2606.32008v1 Announce Type: new Abstract: Mechanistic interpretability (MI) requires full access to model internals, yet the APIs for most widely deployed language models at best expose log-probabilities over output tokens. This creates a surrogate problem: when do…

28
arXiv — Machine Learning research 1d ago

CoMet: Context and Multiplicity Decomposition for Multimodal Uncertainty Estimation

arXiv:2606.32012v1 Announce Type: new Abstract: Uncertainty estimation has been a long-standing challenge in AI models; it amounts to "knowing what you don't know," and metacognition is notoriously difficult even for humans (cf. the Dunning-Kruger effect). Although it is still…

8
arXiv — Machine Learning research 1d ago

FedLAB: Traceable Semantic Codebooks for Federated Multimodal Graph Foundation Learning

arXiv:2606.32016v1 Announce Type: new Abstract: Multimodal graph foundation models aim to learn reusable knowledge from graphs enriched with text, images, attributes, and relational topology, thereby supporting diverse graph-centric and modality-centric tasks. In practice,…

14
arXiv — Machine Learning research 1d ago

TRIAGE: Role-Typed Credit Assignment for Agentic Reinforcement Learning

arXiv:2606.32017v1 Announce Type: new Abstract: Agentic reinforcement learning requires assigning credit to environment-facing actions such as searches, clicks, edits, navigation commands, and object interactions. Standard GRPO uses the final verifier outcome as a uniform…

13
arXiv — NLP / Computation & Language research 1d ago

SemRF: A Semantic Reference Frame for Residual-Stream Dynamics in Language Models

arXiv:2606.32022v1 Announce Type: cross Abstract: Residual-stream analysis asks how language-model computation evolves across depth, but intermediate decoding requires comparable readout coordinates across layers. If embedding anchors and unembedding readout disagree on the…

23
arXiv — Machine Learning research 1d ago

AdaJEPA: An Adaptive Latent World Model

arXiv:2606.32026v1 Announce Type: new Abstract: Latent world models enable planning from high-dimensional observations by predicting future states in a compact latent space. However, these models are typically kept frozen at test time: when their predictions become inaccurate,…

24
arXiv — NLP / Computation & Language research 1d ago

QVal: Cheaply Evaluating Dense Supervision Signals for Long-Horizon LLM Agents

arXiv:2606.32034v1 Announce Type: cross Abstract: LLM agents increasingly act over long horizons, where a single trajectory can contain hundreds or thousands of actions. In these settings, outcome-only rewards provide too sparse guidance, failing to inform the model about the…

36
arXiv — Machine Learning research 1d ago

Seven-dimensional Trajectory Reconstruction for VAMOS++

arXiv:2503.18959v1 Announce Type: cross Abstract: The VAMOS++ magnetic spectrometer is characterized by a large angular and momentum acceptance and highly non-linear ion optics properties requiring the use of software ion trajectory reconstruction methods to measure the ion…

31
arXiv — Machine Learning research 1d ago

Analysis of Atomic Charge State and Atomic Number for VAMOS++ Magnetic Spectrometer using Deep Neural Networks and Fractionally Labelled Events

arXiv:2507.07109v2 Announce Type: cross Abstract: The VAMOS++ magnetic spectrometer is a multi-parametric system that integrates ion optical magnetic elements with a multi-detector stack. The magnetic elements, along with the tracking and timing detectors and the trajectory…

9
arXiv — Machine Learning research 1d ago

MediEncoder: Nonlinear Representation Learning for High-Dimensional Causal Mediation Analysis

arXiv:2606.30648v1 Announce Type: cross Abstract: Causal mediation analysis decomposes a treatment effect into indirect pathways through mediators and direct pathways not operating through them. Modern biomedical studies often involve high-dimensional covariates and mediators…

10
arXiv — Machine Learning research 1d ago

Can Physician Expertise Improve Machine Learning Identification of Delirium?

arXiv:2606.30651v1 Announce Type: cross Abstract: Delirium is common in hospitalized patients and is often missed in routine care. We present a user-centered interactive machine learning (UC-iML) framework for delirium detection support that combines physician-guided feature…

30
arXiv — Machine Learning research 1d ago

Estimating the Effect of Timing on Coupon Effectiveness

arXiv:2606.30664v1 Announce Type: cross Abstract: The coupon incentive is one of the most common tools marketers use to court users to engage with a business at various stages of the customer life cycle. A variety of factors can affect the effectiveness of a coupon incentive on…

7
arXiv — Machine Learning research 1d ago

Explainable Artificial Intelligence For The Detection and Characterisation of Stage B Heart Failure

arXiv:2606.30665v1 Announce Type: cross Abstract: Stage B heart failure is characterized by asymptomatic structural or functional cardiac abnormalities. Identifying individuals at this stage is clinically important, as early detection may enable targeted interventions to prevent…

20
arXiv — Machine Learning research 1d ago

Listening Between the Lines: Joint Learning of ASR Embeddings and LLM-Augmented Linguistics for Dementia Detection

arXiv:2606.30675v1 Announce Type: cross Abstract: Early detection of dementia through speech analysis offers a non-invasive screening alternative, but capturing both acoustic and linguistic biomarkers remains challenging. We propose a multimodal framework leveraging Whisper for…

28
arXiv — Machine Learning research 1d ago

Criticality-Constrained Iterative Pruning for Energy-Efficient Spiking Neural Networks via Combined Importance Scoring

arXiv:2606.30676v1 Announce Type: cross Abstract: Deploying spiking neural networks (SNNs) on neuromorphic hardware demands aggressive synaptic pruning while preserving temporal computation integrity. Existing strategies either neglect neuronal criticality or rely on convex…

5
arXiv — Machine Learning research 1d ago

Locker-based Truck-Drone Routing with Integrated Considerations of Pickups, Deliveries, and No-Fly Zones

arXiv:2606.30680v1 Announce Type: cross Abstract: Truck-drone delivery is an emerging last-mile logistics mode combining the long-haul capacity of trucks with the flexible service capability of drones. In locker-based operations, smart lockers serve not only as temporary parcel…

22
arXiv — Machine Learning research 1d ago

A Coherence Law for Trainability in Noisy Equivariant Quantum Neural Networks

arXiv:2606.30688v1 Announce Type: cross Abstract: Symmetry provides a quantum neural network structure, but on its own it does not keep the network trainable once noise is present. We ask which physical quantity decides whether the gradients of an equivariant circuit survive…

22
arXiv — NLP / Computation & Language research 1d ago

ViTL: Temporal Logic-Guided Zero-Shot Natural Language Navigation via Vision-Language Models

arXiv:2606.30696v1 Announce Type: cross Abstract: Enabling robots to follow natural language commands to complete zero-shot long-horizon tasks remains challenging. It requires extracting implicit temporal and logical constraints from natural language commands and executing…

4
arXiv — Machine Learning research 1d ago

BEST-RQ-2: Contextualize-Then-Predict, a Two-Step Approach for Self-Supervised Audio Representations

arXiv:2606.30700v1 Announce Type: cross Abstract: Self-supervised learning enables audio representations that transfer across domains and tasks. We present BEST-RQ-2, an evolution of BEST-RQ that retains frozen random-projection-based discrete targets while introducing a two-step…

19
arXiv — Machine Learning research 1d ago

Diffusion-warm sampling of the XY model enables fast thermalization at scale

arXiv:2606.30773v1 Announce Type: cross Abstract: We introduce a novel technique for scalable sampling of spin-system states with continuous symmetries using diffusion models. By applying our approach to the XY model, a fundamental continuous-spin model in condensed matter…

7
arXiv — NLP / Computation & Language research 1d ago

A Single Rewrite Suffices: Empirical Lessons from Production Skill Description Optimization

arXiv:2606.30775v1 Announce Type: new Abstract: Enterprise AI agents route user queries to specialized skills by matching queries against natural language skill descriptions. When two skills share overlapping descriptions, the routing LLM misroutes queries, a failure we term…

25
arXiv — NLP / Computation & Language research 1d ago

Indi-RomCoM: Code-Mixed Benchmark for Evaluating LLMs on Romanized Indic-English Instructions

arXiv:2606.30790v1 Announce Type: new Abstract: Romanized Code Mixing (RCM), where bilingual speakers fluidly blend local languages with English in Roman script, has emerged as the dominant form of communication across multilingual communities. While Large Language Models (LLMs)…

26
arXiv — NLP / Computation & Language research 1d ago

Using AI Agents to Automate Black-Box Audits of Personalization Algorithms at Scale

arXiv:2606.30801v1 Announce Type: new Abstract: Personalization algorithms determine what content users encounter on online platforms. Auditing these systems is difficult because independent auditors have only black-box access to the algorithms, while personalization depends on…

37
arXiv — NLP / Computation & Language research 1d ago

When Calibration Rankings Reverse: Accuracy-Controlled Evaluation for Fair Comparison of LLMs

arXiv:2606.30814v1 Announce Type: new Abstract: Calibration evaluates whether a model's confidence aligns with its empirical accuracy. Existing studies often compare the calibration of different large language models using global calibration metrics such as Expected Calibration…

21
arXiv — NLP / Computation & Language research 1d ago

When transformers learn "impossible" languages, what do they learn?

arXiv:2606.30815v1 Announce Type: new Abstract: Recent work suggests that transformer language models show a bias towards human languages over unnatural ("impossible") languages argued to be unacquirable by humans. However, this literature has largely based these claims on…

34
arXiv — NLP / Computation & Language research 1d ago

Test-Time Verification for Text-to-SQL via Outcome Reward Models

arXiv:2606.30851v1 Announce Type: new Abstract: Improving the reliability of large language models (LLMs) at inference time is a central challenge in structured reasoning tasks such as Text-to-SQL. Common test-time inference strategies, including Best-of-N sampling and Majority…

15
arXiv — NLP / Computation & Language research 1d ago

Multilingual Polarization Detection Using Transformer-Based Models with Class Weighting and Threshold Tuning

arXiv:2606.30857v1 Announce Type: new Abstract: This paper describes our submission to SemEval-2026 Task 9 on detecting multilingual, multicultural, and multievent online polarization. We address all three subtasks: binary polarization detection, polarization type…

4
arXiv — NLP / Computation & Language research 1d ago

Training Therapeutic Judges and Multi-Agent Systems for Human-Aligned Mental Health Support

arXiv:2606.30887v1 Announce Type: new Abstract: Large language models show promise for mental health support, yet therapeutic quality improves only when evaluation functions as an actionable control signal rather than a passive metric. We introduce a framework that formulates…

32
arXiv — NLP / Computation & Language research 1d ago

Beyond Clean Text: Evaluating Encoder and Decoder Robustness for Bangla Event Detection in Noisy Text

arXiv:2606.30914v1 Announce Type: new Abstract: Event detection (ED) systems are typically evaluated on clean, curated text, leaving their robustness to real-world noise largely unexplored, particularly for low-resource languages such as Bangla. We introduce a generalized Bangla…

17
arXiv — NLP / Computation & Language research 1d ago

Bridging Scientific Heritage: An Arabic–Russian Parallel Corpus and LLM Benchmark for Sustainable Knowledge Transfer

arXiv:2606.30943v1 Announce Type: new Abstract: Russian and Arabic are among the major languages of scientific communication. Language barriers impede the exchange of research results between these communities, which affects international collaboration and the progress of…

8
arXiv — NLP / Computation & Language research 1d ago

Linguistic Distancing on Social Media: Indicators of Emotion Regulation Across Age Groups

arXiv:2606.30957v1 Announce Type: new Abstract: Managing our emotional responses to events is key to emotional well-being, a process referred to as emotion regulation in psychology. Previous work has established that the degree to which we distance events is a type of emotion…

8
arXiv — NLP / Computation & Language research 1d ago

From Propositional to Perceptual Asymmetry: Extending Frictive Policy Optimization to Asymmetric Partial Information Dialogue

arXiv:2606.30973v1 Announce Type: new Abstract: Frictive Policy Optimization (FPO; Pustejovsky et al., 2025) treats friction in collaborative dialogue (misalignment, misunderstanding, repair) as an epistemic signal essential to common-ground construction, rather than noise…

18
