Tag

Training

450 articles archived under #training · RSS

arXiv — NLP / Computation & Language research 1mo ago

TRACE: Discovering Task-Specific Parameter via Adaptation-Aware Probing for Continual Fine-Tuning

arXiv:2605.31025v1 Announce Type: new Abstract: In real-world deployment, LLMs are often adapted continually across tasks to keep LLMs up-to-date in production, where new fine-tuning should preserve previously learned skills. However, indiscriminately mixing tasks can dilute…

27
arXiv — NLP / Computation & Language research 1mo ago

Towards Efficient LLMs Annealing with Principled Sample Selection

arXiv:2605.31175v1 Announce Type: new Abstract: The annealing phase is a pivotal convergence stage in LLM pre-training that ultimately determines final model quality. However, effectively selecting training data during this phase remains a key challenge. Current strategies rely…

8
arXiv — NLP / Computation & Language research 1mo ago

Reinforcement Learning Amplifies Emergent Misalignment from Harmless Rewards

arXiv:2605.31328v1 Announce Type: new Abstract: Emergent misalignment (EM) is the surprising tendency of language models to become broadly misaligned after fine-tuning on narrowly misaligned examples. While EM has been extensively studied in the supervised fine-tuning (SFT)…

20
r/LocalLLaMA community 1mo ago

when you spend 5 days fine-tuning a model and it still confidently makes things up

submitted by /u/Chapper_App

37
Simon Willison community 1mo ago

datasette 1.0a32

Release: datasette 1.0a32 A minor bugfix release. Fixes a bug with INSERT ... RETURNING queries via the new /db/-/execute-write endpoint and a bunch of base_url issues which showed up when I was experimenting with Service Workers yesterday. Tags: datasette,…

10
Hugging Face Daily Papers research 1mo ago

Beyond 3D VQAs: Injecting 3D Spatial Priors into Vision-Language Models for Enhanced Geometric Reasoning

Abstract Training Vision-Language Models with geometric priors improves 3D spatial reasoning through deep supervision with contrastive loss and depth consistency, achieving better performance than standard fine-tuning approaches. AI-generated summary Vision-Language Models…

25
r/LocalLLaMA community 1mo ago

Mutating Gemma 4 31B Dense in to a native Gemma 4 additive-MoE model

I recently came across an interesting model on Hugging Face from JDONE-Research/AIOne-Agent-52B-A36B-it. It is the first finetune I saw that is built on the Gemma 4 31B dense model but enables MoE for it, training a router + experts and enabling the enable_moe_block config like…

10
Hugging Face Daily Papers research 1mo ago

DynaFLIP: Rethinking Robotics Perception via Tri-Modal-Dynamics Guided Representation

Abstract DynaFLIP is a dynamics-aware multimodal pre-training framework that enhances robot manipulation by integrating motion understanding into visual perception through image-language-3D flow triplets and geometric regularization techniques. AI-generated summary Robot…

22
Hugging Face Daily Papers research 1mo ago

Alignment Tampering: How Reinforcement Learning from Human Feedback Is Exploited to Optimize Misaligned Biases

Abstract Reinforcement Learning from Human Feedback (RLHF) presents alignment tampering vulnerabilities where language models can manipulate preference datasets, leading to amplified undesired behaviors due to limitations in pairwise comparisons and reward modeling. AI-generated…

17
r/MachineLearning community 1mo ago

Making LLMs tell you how confident they really are through probe-targeted fine tuning.[R]

Just wanted to share my research regarding probe-targeted fine-tuning (LoRA) for verbal confidence calibration. If you probe the hidden states of an instruct-tuned LLM, it can tell correct from incorrect answers at 0.76–0.88 AUROC. But when you ask it directly it tends to…

16
r/LocalLLaMA community 1mo ago

Liquid AI releases LFM2.5-8B-A1B

Liquid AI released LFM2.5-8B-A1B, an edge model designed to power real-life applications. It builds on LFM2-8B-A1B with three major upgrades: an expanded 128K context window, 38T tokens of pre-training (up from 12T), and large-scale reinforcement learning. It also comes with a…

14
arXiv — Machine Learning research 1mo ago

Mechanistic origins of catastrophic forgetting: why RL preserves circuits better than SFT?

arXiv:2605.28860v1 Announce Type: new Abstract: Fine-tuning large language models (LLMs) frequently induces catastrophic forgetting of prior capabilities. Recent work has shown that reinforcement learning (RL) retains prior capabilities more effectively than supervised…

12
arXiv — Machine Learning research 1mo ago

Feature Geometry of LoRA Adapters: A Sparse Autoencoder Analysis of Representational Divergence in Fine-Tuned Language Models

arXiv:2605.28896v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) has emerged as a widely adopted approach for adapting large language models, yet the internal representational changes induced by LoRA fine-tuning remain insufficiently understood. In this work, we…

31
arXiv — Machine Learning research 1mo ago

On-Policy Replay for Continual Supervised Fine-Tuning

arXiv:2605.29495v1 Announce Type: new Abstract: Continual supervised fine-tuning (SFT) is the de facto recipe for adapting large language models (LLMs) to a stream of downstream tasks, but it suffers from catastrophic forgetting of earlier capabilities. Recent work shows that…

20
arXiv — Machine Learning research 1mo ago

On the Construction and Implications of Low-Loss Valleys in LoRA-based Bayesian Inference

arXiv:2605.29580v1 Announce Type: new Abstract: While parameter-efficient fine-tuning methods like low-rank adaptation (LoRA) are standard for large language models, principled estimation of epistemic uncertainty remains challenging. Recent results in the LoRA regime suggest…

38
arXiv — NLP / Computation & Language research 1mo ago

Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

arXiv:2605.29076v1 Announce Type: new Abstract: LLMs have advanced text classification, yet existing paradigms face a trade-off: supervised (label only) fine-tuning is scalable but offers limited reasoning on complex text and lacks broader model transparency, while discrete…

21
arXiv — NLP / Computation & Language research 1mo ago

FoRA: Fisher-orthogonal Rank Adaptation for Parameter-Efficient Fine-Tuning

arXiv:2605.29317v1 Announce Type: new Abstract: Parameter-efficient fine-tuning (PEFT) has largely focused on LoRA and its accuracy-oriented variants, while the original goal of reducing trainable parameters has received comparatively little attention. We introduce FoRA, which…

4
arXiv — NLP / Computation & Language research 1mo ago

Mask the Target: A Plug-and-Play Regularizer Against LoRA Forgetting

arXiv:2605.29498v1 Announce Type: new Abstract: Low-Rank Adaptation (LoRA) has become one of the most widely used fine-tuning mechanisms for adapting large language models to new domains, tasks, and users. Yet adaptation performance alone can obscure an important failure mode:…

27
arXiv — NLP / Computation & Language research 1mo ago

Source-Grounded Semantic Reinforcement Learning for Low-Resource Target-Language Generation

arXiv:2605.29502v1 Announce Type: new Abstract: Low-resource target-language generation is often limited by scarce parallel data, while high-resource source-language monolingual data is abundant but difficult to use with standard supervised fine-tuning. We propose…

37
Hugging Face Daily Papers research 1mo ago

minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

Abstract A comprehensive framework is presented for converting bidirectional video diffusion models into real-time interactive world models with controllable, causal, and low-latency capabilities through fine-tuning and distillation techniques. AI-generated summary Recent video…

8
Ars Technica — AI news-outlet 1mo ago

LLMs believe false statements even after explicit warnings that they're false

Fine-tuning tests show "bias ... toward confidently representing the claims as true."

16
r/LocalLLaMA community 1mo ago

LiquidAI/LFM2.5-8B-A1B · Hugging Face

looks like you can run it on any potato (A1B)! https://huggingface.co/LiquidAI/LFM2.5-8B-A1B-GGUF from LiquidAI: LFM2.5 is a new family of hybrid models designed for on-device deployment. It builds on the LFM2 architecture with extended pre-training and reinforcement learning.…

22
r/LocalLLaMA community 1mo ago

losing my mind fine-tuning jina-v5 for a legal corpus

For the last month i've been trying to fine-tune jina-v5 (which has performed best on my corpus out of the box) on slovak law chunks, time and time again no matter what i do I can't get the model to learn nuance of slovak syntax. here's the biggest trap chunk that keeps…

5
r/LocalLLaMA community 1mo ago

HF models page now has a "Base only" toggle to filter out finetunes/quants/etc

a feature that was requested a lot: https://huggingface.co/models?base_model_relation=base submitted by /u/paf1138

5
arXiv — Machine Learning research 1mo ago

Architecture-driven Shift: towards a lightweight selector for capturing the trends of logit shift

arXiv:2605.27469v1 Announce Type: new Abstract: Continual Learning (CL) is a practical paradigm to utilize power of deep pre-trained neural networks, but which pre-trained model has a better ability to balance "Plasticity-Stability", deserving to be chosen? The logit shift…

35
arXiv — Machine Learning research 1mo ago

Gradient Transformer: Learning to Generate Updates for LLMs

arXiv:2605.27591v1 Announce Type: new Abstract: Many organizations lack computational resources to fine-tune large language models (LLMs) on private (unshareable) data for better utility, while fine-tuning tiny language models (TinyLMs) alone performs poorly. To address this…

15
arXiv — Machine Learning research 1mo ago

Restoring the Sweet Spot: Pass-Rate Weighted Self-Distillation for LLM Reasoning

arXiv:2605.27765v1 Announce Type: new Abstract: Self-Distillation Policy Optimization (SDPO) provides dense token-level credit assignment for reinforcement learning with large language models by leveraging the model's own feedback-conditioned predictions as a self-teacher.…

34
arXiv — Machine Learning research 1mo ago

Fine-Tuning Dynamics of In-Context Factual Recall in Transformers

arXiv:2605.27774v1 Announce Type: new Abstract: In-context learning (performing tasks based on examples given in the prompt) is an important capability that has emerged in large language models and has received significant attention in both theory and practice. Existing…

29
arXiv — Machine Learning research 1mo ago

Density-aware Sample-specific Attack

arXiv:2605.27809v1 Announce Type: new Abstract: Despite recent progress in backdoor attacks, existing methods remain susceptible to post-training defenses that erase the backdoor through fine-tuning or pruning. We revisit the core objectives of backdoor attacks and derive…

29
arXiv — Machine Learning research 1mo ago

CAREF: Calibration-Aware Regularization for Explanation Faithfulness Without Rationale Supervision

arXiv:2605.27835v1 Announce Type: new Abstract: We introduce CAREF, a parameter-efficient fine-tuning framework that jointly optimizes predictive accuracy and explanation faithfulness via calibration-aware regularization. At its core, CAREF couples entropy-based calibration with…

34
arXiv — Machine Learning research 1mo ago

Continual Learning in Modern Hopfield Networks with an Application to Diffusion Models

arXiv:2605.27975v1 Announce Type: new Abstract: Generative models, including diffusion models, are increasingly used as foundation models and adapted through sequential fine-tuning, making continual learning an essential problem setting. However, continual learning in such…

33
arXiv — Machine Learning research 1mo ago

SPARD: Defending Harmful Fine-Tuning Attack via Safety Projection with Relevance-Diversity Data Selection

arXiv:2605.28030v1 Announce Type: new Abstract: Fine-tuning large language models often undermines their safety alignment, a problem further amplified by harmful fine-tuning attacks in which adversarial data removes safeguards and induces unsafe behaviors. We propose SPARD, a…

22
arXiv — NLP / Computation & Language research 1mo ago

From AR to Diffusion: Efficiently Adapting Large Language Models with Strictly Causal and Elastic Horizons

arXiv:2605.27387v1 Announce Type: new Abstract: Diffusion models promise efficient parallel text generation but rely on bidirectional attention, creating a structural mismatch with pre-trained Autoregressive (AR) models. This incompatibility precludes reusing robust AR priors,…

4
arXiv — NLP / Computation & Language research 1mo ago

The Missing Piece in Pre-trained Model Evaluation: Reward-Guided Decoding Unlocks Task-Oriented Behavior Without Parameter Updates

arXiv:2605.28020v1 Announce Type: new Abstract: With the rapid progress of large language models (LLMs), reliably evaluating the capabilities of pre-trained LLMs has become increasingly important. The challenge is that base pre-trained models are optimized for next-token…

29
arXiv — NLP / Computation & Language research 1mo ago

Routing-Aligned Fine-Tuning for Multilingual Downstream Tasks in Mixture-of-Experts Models

arXiv:2605.28306v1 Announce Type: new Abstract: Mixture-of-Experts (MoE) models have emerged as a dominant paradigm for efficient LLM scaling, yet adapting them to non-English downstream tasks remains challenging. Existing fine-tuning approaches treat MoE models as monolithic…

37
r/LocalLLaMA community 1mo ago

Gemma-4-Harmonia-31B-Uncensored-Heretic Is Out Now, a Merge of Multiple gemma-4-31B-it Finetunes Designed for a Targeted Approach to Deep Neural Consolidation, Minimizing Regression While Amplifying Unique Capability Boundaries. With KLD 0.0047 and 9/100 Refusals!

Provided in both Safetensors and GGUFs. Safetensors, llmfan46/Gemma-4-Harmonia-31B-it-uncensored-heretic: https://huggingface.co/llmfan46/Gemma-4-Harmonia-31B-uncensored-heretic GGUFs, llmfan46/Gemma-4-Harmonia-31B-it-uncensored-heretic-GGUF:…

14
r/MachineLearning community 1mo ago

UK GDPR Small Business Q&A — 5,000 synthetic pairs with article-level citations [D]

Dataset for fine-tuning compliance assistants. Each pair includes: - A practical SME-facing question ("Can I use pre-ticked consent boxes?") - An answer with specific UK GDPR article references, ICO guidance by name, and actionable steps - Source metadata: which GDPR concepts…

23
r/LocalLLaMA community 1mo ago

I built a 103B-token Usenet corpus (1980–2013) — pre-web, human-only, zero AI contamination. Got strong traction on r/ML, thought this community would find it useful.

Posted this to r/MachineLearning a couple weeks ago (30K views, 100+ upvotes) and have been meaning to share it here where the fine-tuning angle is more directly relevant. I spent years building and processing a complete Usenet corpus from 1980 to 2013. Here’s why it might…

37
r/LocalLLaMA community 1mo ago

ReAligned-Qwen3.5 Release

New from Lazarus AI and Eric Hartford, creator of Dolphin and Samantha, announcing the release of the ReAligned-Qwen3.5 series of models. Apache 2.0 license, finetuned to reduce Chinese ideological bias and censorship, refusal behavior, and state-narrative framing. I use SFT +…

19
Hugging Face Daily Papers research 1mo ago

NSF-SciFy: Mining the NSF Awards Database for Scientific Claims

Abstract NSF-SciFy is a large-scale dataset of scientific claims and investigation proposals extracted from NSF award abstracts, enabling improved language model fine-tuning for claim verification and scientific discovery tracking. AI-generated summary We introduce NSF-SciFy, a…

22
Hugging Face Daily Papers research 1mo ago

Understanding Data Temporality Impact on Large Language Models Pre-training

Abstract Pre-training large language models on temporally ordered data improves their factual freshness and temporal precision compared to standard shuffled pre-training while maintaining general language understanding capabilities. AI-generated summary Large language models…

4
arXiv — Machine Learning research 1mo ago

GEM: Geometric Entropy Mixing for Optimal LLM Data Curation

arXiv:2605.26121v1 Announce Type: new Abstract: LLM pre-training efficacy increasingly depends on data composition rather than sheer volume. Yet, optimal mixing is hindered by categorization flaws: human taxonomies suffer from ontological misalignment, and Euclidean clustering…

27
arXiv — Machine Learning research 1mo ago

GAC: Noise-Aware Adaptive Mixing for Hybrid SFT-RL Post-Training

arXiv:2605.26184v1 Announce Type: new Abstract: Hybrid post-training usually combines supervised fine-tuning and reinforcement learning, but fixed mixing schedules cannot adapt when the relative noise of the two signals changes over time. We propose GAC, a noise-aware controller…

24
arXiv — Machine Learning research 1mo ago

Curriculum Learning for Safety Alignment

arXiv:2605.26315v1 Announce Type: new Abstract: Direct Preference Optimisation (DPO) is widely used for safety alignment in large language models. However, prior work shows it is brittle and exhibits poor out-of-distribution (OOD) generalisation. In this paper, we investigate…

20
arXiv — Machine Learning research 1mo ago

Aperiodic and Low-Frequency Spectral Bias in Reconstruction based EEG Foundation Models

arXiv:2605.26434v1 Announce Type: new Abstract: EEG foundation models, pre-trained on large-scale unlabelled EEG data, have emerged as a promising direction towards learning generalizable EEG representations. Despite showing positive results in data-rich regimes, they often fail…

23
arXiv — Machine Learning research 1mo ago

Extra-Merge: Tracing the Rank-1 Subspace of Model Merging in Language Model Pre-Training

arXiv:2605.26484v1 Announce Type: new Abstract: Model merging has emerged as a lightweight paradigm for enhancing Large Language Models (LLMs), yet its underlying mechanisms remain poorly understood. In this work, we analyze late-stage pre-training trajectories and uncover a…

16
arXiv — Machine Learning research 1mo ago

The Stability of Singular Distribution: A Spectral Perspective on the Two-Phase Dynamics of Language Model Pre-training

arXiv:2605.26489v1 Announce Type: new Abstract: Large language model pre-training typically exhibits a two-phase trajectory: a fast initial loss drop followed by a prolonged slow improvement. We identify an underlying spectral phenomenon, Stability of Singular Distribution…

11
arXiv — Machine Learning research 1mo ago

Beyond Pairwise Preferences: Listwise Reward-Aware Alignment for Diffusion Models

arXiv:2605.26491v1 Announce Type: new Abstract: Preference optimization has emerged as an efficient alternative to online reinforcement learning from human feedback (RLHF) for aligning text-to-image diffusion models. However, existing methods largely reduce supervision to binary…

10
arXiv — Machine Learning research 1mo ago

Open-Weight LLM Fine-Tuning Defenses are Susceptible to Simple Attacks

arXiv:2605.26526v1 Announce Type: new Abstract: Recent defenses for safeguarding open-weight large language models (LLMs) are intended to prevent adversarial usage. Underlying these defenses is an assumption that new harmful behavior is learned through fine-tuning rather than…

33
arXiv — NLP / Computation & Language research 1mo ago

Learning to Adapt SFT Data for Better Reasoning Generalization

arXiv:2605.26924v1 Announce Type: new Abstract: Large language models (LLMs) have achieved remarkable progress, with post-training playing a crucial role in enhancing their reasoning capabilities. Among post-training paradigms, supervised fine-tuning (SFT) is widely used: it…

8
