Tag

Rag

500 articles archived under #rag

arXiv — NLP / Computation & Language research 24d ago

Multilingual Multi-Speaker Unit Vocoders: A Systematic Analysis of Discrete Speech Representations

arXiv:2606.06740v1 Announce Type: cross Abstract: Discrete speech units obtained via k-means clustering of self supervised embeddings entangle phonetic, speaker, and language information, causing speaker mixing and cross-lingual interference in multilingual multi-speaker speech…

22
arXiv — NLP / Computation & Language research 24d ago

MADRAG: Multi-Agent Debate with Retrieval-Augmented Generation for Training-Free Analytic Essay Scoring

arXiv:2606.06754v1 Announce Type: cross Abstract: We present MADRAG, a training-free framework for analytic essay scoring that combines multi-agent reasoning with retrieval-augmented grounding. Unlike standard LLM-as-judge approaches, which are prone to bias and unstable…

10
arXiv — NLP / Computation & Language research 24d ago

HKVM-RAG: Key-Value-Separated Hypergraph Evidence Organization for Multi-Hop RAG

arXiv:2606.07218v1 Announce Type: cross Abstract: Multi-hop RAG poses a data-engineering problem beyond passage matching: under fixed retrieval budgets, a system must organize retrieved text into evidence units that expose answer chains. Dense retrievers score passages…

32
arXiv — NLP / Computation & Language research 24d ago

TEVI: Text-Conditioned Editing of Visual Representations via Sparse Autoencoders for Improved Vision-Language Alignment

arXiv:2606.07451v1 Announce Type: cross Abstract: Vision-language models such as CLIP are highly useful for diverse tasks due to their shared image-text embedding space. Despite this, the image and text embeddings are often poorly aligned, affecting downstream performance.…

6
arXiv — NLP / Computation & Language research 24d ago

CTR-Sink: Attention Sink for Language Models in Click-Through Rate Prediction

arXiv:2508.03668v3 Announce Type: replace Abstract: Click-Through Rate (CTR) prediction, a core task in recommendation systems, estimates user click likelihood using historical behavioral data. Modeling user behavior sequences as text to leverage Language Models (LMs) for this…

5
arXiv — NLP / Computation & Language research 24d ago

SWE-IF: Aligning Code Evaluation with Human Preference

arXiv:2510.07315v2 Announce Type: replace Abstract: Large Language Models (LLMs) have catalyzed vibe coding, where users leverage LLMs to generate and iteratively refine code through natural language interactions until it passes their vibe check. Vibe check reflects human…

14
arXiv — NLP / Computation & Language research 24d ago

Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation

arXiv:2601.06600v4 Announce Type: replace Abstract: Short-video platforms have become major channels for misinformation, where deceptive claims frequently leverage visual experiments and social cues. While Multimodal Large Language Models (MLLMs) have demonstrated impressive…

24
arXiv — NLP / Computation & Language research 24d ago

SEEK: Steering LLM Reasoning for RAG via Internal Reasoning Sketches

arXiv:2601.09402v2 Announce Type: replace Abstract: Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by incorporating external knowledge into the generation process. Benefiting from the reasoning capabilities of LLMs, existing methods have leveraged…

8
Hugging Face Daily Papers research 24d ago

Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings

Abstract Text embeddings from large language models are enhanced by EmbedFilter, a linear transformation that reduces the influence of high-frequency tokens and improves semantic representations while enabling dimensionality reduction. Generated by…

34
Hugging Face Daily Papers research 24d ago

Socratic-SWE: Self-Evolving Coding Agents via Trace-Derived Agent Skills

Abstract Socratic-SWE enables self-evolving software engineering agents by leveraging historical solving traces to generate targeted repair tasks that improve agent performance through iterative refinement. Generated by Qwen/Qwen2.5-Coder-32B-Instruct LLM-driven software…

21
r/LocalLLaMA community 24d ago

Qwen 3.6 27B on DeepSWE

Overview: It scored 2% (1.79% rounded up). It is in 18th place out of 20, scoring above Haiku 4.5 and Minimax M2.7. Full benchmark took 70 hours. Average time per task: 32m. Average output tokens per task: 44k. Perspectives: It scored suspiciously similar to 3.6 Plus and it really gets me…

21
r/LocalLLaMA community 25d ago

Alternatives to ChromaDB for easy RAG search

I'm disappointed that ChromaDB's local, free "single node" version is still getting second-class, hand-me-down features while the "distributed" version (a SaaS offering, unsurprisingly) gets built in hybrid search, BM25, etc. I tried to give the benefit of the doubt and wait,…

4
Hugging Face Daily Papers research 27d ago

BRepCLIP: Contrastive Multimodal Pretraining on BRep Primitives for CAD Understanding

Abstract BRepCLIP enables multimodal representation learning for CAD models by aligning boundary representation geometry with language and image embeddings through contrastive pretraining, achieving superior retrieval and classification performance compared to point-based…

7
Hacker News — AI on Front Page community 27d ago

Harness engineering: Leveraging Codex in an agent-first world

Article URL: https://openai.com/index/harness-engineering/ Comments URL: https://news.ycombinator.com/item?id=48416264 Points: 221 # Comments: 137

16
Hugging Face Daily Papers research 27d ago

AffordanceVLA: A Vision-Language-Action Model Empowering Action Generation through Affordance-Aware Understanding

Abstract AffordanceVLA introduces a unified framework that uses structured affordance forecasting as an intermediate representation to improve the precision of perception-action mapping in robotic manipulation by leveraging vision-language models. Generated by…

4
Hacker News — AI on Front Page community 27d ago

Conventional Commits encourages focus on the wrong things

Article URL: https://sumnerevans.com/posts/software-engineering/stop-using-conventional-commits/ Comments URL: https://news.ycombinator.com/item?id=48414027 Points: 204 # Comments: 168

30
Hugging Face Daily Papers research 27d ago

AURA: Intent-Directed Probing for Implicit-Need Surfacing in Situated LLM Agents

Abstract AURA enhances query answering by incorporating an intent inference step that estimates implicit needs and optimizes tool usage through gap scoring, achieving better implicit-need coverage and reduced probe consumption compared to standard approaches. Generated by…

15
Hugging Face Daily Papers research 27d ago

The Shape of Addition: Geometric Structures of Arithmetic in Large Language Models

Abstract Large language models show arithmetic fragility due to geometric structures in residual streams, where neural noise causes quantization failures that can be detected and corrected through geometric analysis. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Large Language…

6
Hugging Face Daily Papers research 27d ago

Absorbing Complexity: An Interaction-Native Knowledge Harness for Financial LLM Agents

Abstract Financial AI agents struggle with user complexity, but a new architecture called InKH addresses this by embedding complexity into the system through structured knowledge management and temporal memory mechanisms. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Financial AI…

15
Hugging Face Daily Papers research 27d ago

MechVQA: Benchmarking and Enhancing Multimodal LLMs on Comprehensive Mechanical Drawing Understanding

Abstract Mechanical engineering drawing understanding is improved through a specialized dataset and domain-specific model that outperforms existing baselines by leveraging multi-stage training and high-density visual question answering annotations. Generated by…

9
arXiv — Machine Learning research 27d ago

The Evaluation Blind Spot: A Stereological Theory of Benchmark Coverage for Large Language Models

arXiv:2606.05169v1 Announce Type: new Abstract: We give a stereological theory of LLM benchmark coverage. For any suite with effective dimensionality d_eff, the visible Hausdorff distance between two convex capability profiles consistent with the same scores is bounded by…

30
arXiv — Machine Learning research 27d ago

MolE-RAG: Molecular Structure-Enhanced Retrieval-Augmented Generation for Chemistry

arXiv:2606.05693v1 Announce Type: new Abstract: Large language models (LLMs) have shown promise for molecular property prediction, but their ability to reason over chemical structures remains limited, as molecular representations such as SMILES differ substantially from the…

16
arXiv — Machine Learning research 27d ago

Consistency Training Along the Transformer Stack

arXiv:2606.05817v1 Announce Type: new Abstract: Consistency training encourages models to behave similarly across different contexts, and has shown promise for reducing misalignment. We broaden the scope of consistency training in two ways. First, we introduce two new internal…

37
arXiv — Machine Learning research 27d ago

Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation

arXiv:2606.05988v1 Announce Type: new Abstract: Reasoning models produce long chain-of-thought traces that are costly to distill and encourage verbose student outputs. We study post-hoc compression of such traces before knowledge distillation. Two teachers, Qwen3.5-397B-A17B and…

30
arXiv — Machine Learning research 27d ago

Generative Criticality in Large Language Model Temperature Scaling

arXiv:2606.06238v1 Announce Type: new Abstract: We propose a statistical-field framework for text generated by large language models (LLMs), treating token embeddings as continuous spin variables on a one-dimensional chain. Defining a susceptibility from the connected two-point…

21
arXiv — NLP / Computation & Language research 27d ago

Predict and Reconstruct: Joint Objectives for Self-Supervised Language Representation Learning

arXiv:2606.05173v1 Announce Type: new Abstract: Masked language modelling (MLM) has been the dominant pre-training objective for text encoders since BERT, yet it encourages representations that are strongly anchored to surface-form token identity rather than deeper semantic…

22
arXiv — NLP / Computation & Language research 27d ago

TensorBench: Benchmarking Coding Agents on a Compiler-Based Tensor Framework

arXiv:2606.05570v1 Announce Type: new Abstract: Repository-level coding benchmarks face a trade-off between task difficulty and evaluation reliability: tasks that challenge frontier models often involve large codebases with incomplete test coverage, while human review does not…

32
arXiv — NLP / Computation & Language research 27d ago

Narrative Knowledge Weaver: Narrative-Centric Retrieval-Augmented Reasoning for Long-Form Text Understanding

arXiv:2606.05724v1 Announce Type: new Abstract: Long-form narrative QA requires reasoning over evolving story worlds rather than isolated passages: answers may depend on earlier goals, changing character states, social relations, causal triggers, temporal position, and later…

24
arXiv — NLP / Computation & Language research 27d ago

ReverseEOL: Improving Training-free Text Embeddings via Text Reversal in Decoder-only LLMs

arXiv:2606.05858v1 Announce Type: new Abstract: Recent advances in Large Language Models (LLMs) have opened new avenues for generating training-free text embeddings. However, the causal attention in decoder-only LLMs prevents earlier tokens from attending to future context,…

35
arXiv — NLP / Computation & Language research 27d ago

Reducing Hallucinations in Complex Question Answering using Simple Graph-based Retrieval-Augmented Generation (long version)

arXiv:2606.05901v1 Announce Type: new Abstract: Large language models (LLMs) have fundamentally transformed the landscape of Natural Language Processing. Despite these advances, LLMs and LLM-based systems remain prone to a variety of failure modes. Retrieval-augmented generation…

37
arXiv — NLP / Computation & Language research 27d ago

IA-RAG: Interval-Algebra-Driven Temporal Reasoning for Dynamic Knowledge Retrieval

arXiv:2606.06044v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has shown strong effectiveness in grounding Large Language Models (LLMs) with external knowledge. However, existing RAG and Graph RAG frameworks largely treat knowledge as static or associate…

13
Hugging Face Daily Papers research 27d ago

Reinforcement Learning Elicits Contextual Learning of Unseen Language Translation

Abstract Reinforcement learning approach enables large language models to translate unseen languages by leveraging in-context linguistic knowledge rather than memorizing specific languages. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Prior work has shown that large language…

8
Vercel — AI dev-tools 27d ago

Drives for Vercel Sandbox in Private Beta

Vercel Sandbox now supports drives in private beta. Drives are persistent, attachable storage with a lifecycle independent from any sandbox. Create a drive once, then mount it at a configurable path when starting a sandbox. When the sandbox stops, the drive remains available to…

38
r/LocalLLaMA community 28d ago

You guys were right - Qwen 3.6 35B IS good...and KV Cache DOES matter.

WARNING: I'm speed typing this, no time to organize/format, so if short paragraph chunks bother you, just keep it moving. When Qwen 3.6 35B dropped, a lot of people were heaping praises and I thought they were just glazing it because of the speed. 27B was objectionably smarter…

36
r/MachineLearning community 28d ago

[P] Stop using print() to debug your agents. Here's a 60-second alternative.

Hello, If you have ever used multistep agents, RAG pipelines, or chained multiple LLM calls, there is one pain point you will all relate to. When an agent gets stuck in an infinite loop, suddenly hallucinates on the third step, or is quietly burning through OpenAI API credits...…

20
The Information — AI news-outlet 28d ago

Billionaire Databricks and Perplexity Co-Founder Pitches AI Researchers to Not Work for Big Tech

The billionaire co-founder of Databricks and Perplexity AI, Andy Konwinski, is singularly focused on plugging the years-long drain of talent from academia to Big Tech. He wants to encourage academics to focus on publishing more openly available research, a reaction to the move…

18
r/LocalLLaMA community 28d ago

I Built a Practical Guide to LLM Engineering: RAG, Retrieval, Rerankers, and Evaluation

If you’re building LLM apps and feel confused about when to use keyword search, embeddings, rerankers, or vector databases, this repo is for that. I built a docs-first repo on practical LLM system design patterns, covering pre-filtering, hybrid retrieval, rerankers, in-memory…

23
llama.cpp releases dev-tools 28d ago

b9503

fix(mtmd): handle Gemma 4 audio projector embedding size (#24091). mtmd: handle Gemma 4 audio projector embedding size; rm projection_dim from clip_n_mmproj_embd. Co-authored-by: Xuan Son Nguyen son@huggingface.co. macOS/iOS: macOS Apple Silicon (arm64) macOS Apple Silicon (arm64,…

28
r/MachineLearning community 28d ago

Embedding space [D]

Hello everyone, I’m relatively new to this area of machine learning and currently experimenting with Variational Autoencoders (VAEs) to build an embedding space for an image dataset. The images have different spatial dimensions, so I cannot easily standardize them to a fixed size.…

11
arXiv — Machine Learning research 28d ago

Stationarity-Aware Retrieval-Augmented Time Series Forecasting

arXiv:2606.04135v1 Announce Type: new Abstract: Time series forecasting relies on historical patterns, but real-world series often exhibit non-stationarity and regime shifts that challenge fully parametric forecasters. Inspired by Retrieval-Augmented Generation (RAG), recent…

8
arXiv — Machine Learning research 28d ago

When Autoregressive Consistency Hurts Safety Alignment

arXiv:2606.04168v1 Announce Type: new Abstract: Safety alignment in large language models (LLMs) is fragile in part because it is often shallow: fine-tuning mainly reshapes the model's behavior near the first few output tokens. We argue that this phenomenon can be understood…

21
arXiv — Machine Learning research 28d ago

Literature-Guided Minimax Optimization of Virtual Epilepsy Neurostimulation

arXiv:2606.04339v1 Announce Type: new Abstract: Computational models of epilepsy promise patient-specific treatment design, but most optimization workflows still search for parameters that perform well on average. In neuromodulation, this is a weak target: a protocol that…

15
arXiv — Machine Learning research 28d ago

Shortcomings and capacities of real-constrained neural networks in complex spaces

arXiv:2606.04390v1 Announce Type: new Abstract: We find the asymptotic ratio between the storage capacities when enforcing real pre-activations in a complex hypothesis class as opposed to complex ones in the same class. Our methods depend on Gardner volume comparisons at…

6
arXiv — Machine Learning research 28d ago

On Out-of-sample Embedding in UMAP

arXiv:2606.04451v1 Announce Type: new Abstract: Neighbor embedding algorithms reveal correlations in high-dimensional data by constructing an equivalent graph representation in a lower-dimensional space. An increasingly popular algorithm is Uniform Manifold Learning and…

34
arXiv — Machine Learning research 28d ago

Learning symplectic model reduction based on a approximation theorem of symplectic embeddings

arXiv:2606.04623v1 Announce Type: new Abstract: High-dimensional Hamiltonian systems play a central role in many scientific and engineering disciplines, with dynamics evolving on symplectic manifolds. Although deep learning provides powerful tools for constructing…

9
arXiv — Machine Learning research 28d ago

Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation

arXiv:2606.04665v1 Announce Type: new Abstract: Deep unsupervised domain adaptation (Deep UDA) methods successfully leverage rich labeled data in a source domain to boost the performance on related but unlabeled data in a target domain. However, algorithm comparison is…

23
arXiv — Machine Learning research 28d ago

Curvature-aware dynamic precision approach for physics-informed neural networks

arXiv:2606.04736v1 Announce Type: new Abstract: Physics-informed neural networks (PINNs) have become a promising framework for simulating partial differential equations (PDEs) by embedding physical laws directly into neural network training. However, recent studies show that…

24
arXiv — NLP / Computation & Language research 28d ago

When Retrieval Doesn't Help: A Large-Scale Study of Biomedical RAG

arXiv:2606.04127v1 Announce Type: new Abstract: Medical question answering is a high-stakes setting where factual errors can have serious consequences. Retrieval-augmented generation (RAG) is widely viewed as a promising solution, and prior work has reported substantial gains…

35
arXiv — NLP / Computation & Language research 28d ago

MM-BizRAG: Rethinking Multimodal Retrieval-Augmented Generation for General Purpose Enterprise Q&A

arXiv:2606.04231v1 Announce Type: new Abstract: Recent advances in multimodal retrieval-augmented generation (MM-RAG) have shifted toward minimal parsing, relying on page-level images for producing retriever embeddings and for answer generation. While efficient, this trend often…

24
arXiv — NLP / Computation & Language research 28d ago

LazyAttention: Efficient Retrieval-Augmented Generation with Deferred Positional Encoding

arXiv:2606.04302v1 Announce Type: new Abstract: Key-value (KV) caching accelerates inference of large language models (LLMs) by reusing past computations for generated tokens. Its importance becomes even greater in long-context applications such as retrieval-augmented generation…

10
