Tag

Rag

500 articles archived under #rag

arXiv — NLP / Computation & Language research 2d ago

How to Leverage Synthetic Speech for LLM-Based ASR Systems?

arXiv:2606.29031v1 Announce Type: new Abstract: In regulated domains such as banking and healthcare, where privacy constraints make real speech costly to collect and retain, synthetic speech from modern text-to-speech (TTS) is an appealing alternative for training automatic…

15
arXiv — NLP / Computation & Language research 2d ago

A Comparative Study on Affective Cues in Text Embeddings Across Psychological Emotion Theories

arXiv:2606.29068v1 Announce Type: new Abstract: Text encoders are known for their utility in natural language processing, as they are able to efficiently compress inputs into dense vectors while preserving semantics. These models have been applied to affective computing, in…

19
arXiv — NLP / Computation & Language research 2d ago

AB-RAG: Adaptive Budgeted Retrieval-Augmented Generation for Reliable Question Answering

arXiv:2606.29090v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) has become the standard way to ground large language models in external knowledge, yet most systems retrieve a fixed number of passages for every question regardless of its difficulty. This…

11
arXiv — NLP / Computation & Language research 2d ago

MIThinker: A Plug-and-Play Policy-Optimized Thinker For Motivational Interviewing Counseling

arXiv:2606.29265v1 Announce Type: new Abstract: Reasoning large language models (LLMs) have recently made much progress in complex problem-solving, leveraging internal reasoning (or thought) to guide their solution generation. However, existing LLM-based counseling agents,…

17
arXiv — NLP / Computation & Language research 2d ago

TriageRA-CCF: Source-Side Clinical Confidence and Coverage Signals for Adaptive Rank Budgeting in Medical LLMs

arXiv:2606.29375v1 Announce Type: new Abstract: Medical large language models are commonly adapted with a fixed low-rank budget, even though medical questions differ substantially in confidence, clinical coverage, and cross-domain difficulty. We study adaptive rank budgeting for…

15
arXiv — NLP / Computation & Language research 2d ago

mamabench and mamaretrieval: Benchmarks for Evaluating Medical Retrieval-Augmented Generation in Maternal, Neonatal, and Reproductive Health

arXiv:2606.29467v1 Announce Type: new Abstract: Medical question-answering benchmarks rarely cover the maternal, neonatal, child, and reproductive-health questions a nurse-midwife asks, and, to our knowledge, no public chunk-level relevance benchmark exists for maternal-health…

25
arXiv — NLP / Computation & Language research 2d ago

Coverage-Driven KV Cache Eviction for Efficient and Improved Inference of LLM

arXiv:2606.29563v1 Announce Type: new Abstract: Large language models (LLMs) excel at complex tasks like question answering and summarization, thanks to their ability to handle long-context inputs. However, deploying LLMs is costly, not only due to the high computational demands…

7
arXiv — NLP / Computation & Language research 2d ago

Anisotropy Decides Cosine vs. Rank Metrics for Text Embeddings

arXiv:2606.29571v1 Announce Type: new Abstract: The standard way to compare two text embeddings is cosine similarity. Scattered studies report that a different metric does better, but never pin down the geometric condition that decides when, or why. We settle both with a…

10
arXiv — NLP / Computation & Language research 2d ago

MAM-AI: An On-Device Medical Retrieval-Augmented Generation System for Nurses and Midwives in Zanzibar

arXiv:2606.29580v1 Announce Type: new Abstract: Maternal and newborn mortality remain among the highest in sub-Saharan Africa, where midwifery care is often delivered by nurses who lack midwifery training to international standards, and consulting authoritative guidance at the…

7
arXiv — NLP / Computation & Language research 2d ago

Managing Map Cardinality in Automatic Disease Classification Mapping: Balancing Precision, Recall and Coverage

arXiv:2606.29750v1 Announce Type: new Abstract: Automatic mapping between disease classification systems, such as the International Classification of Diseases (ICD), is a challenging yet essential task for integrating health data and conducting longitudinal data analysis.…

32
arXiv — NLP / Computation & Language research 2d ago

MemDelta: Controlled Baselines and Hidden Confounds in Agent Memory Evaluation

arXiv:2606.29914v1 Announce Type: new Abstract: Agent memory systems are increasingly evaluated against RAG and full-context baselines, but reported gains often mix changes in the memory method with changes in the language model, embedding model, or retrieval pipeline, making it…

4
arXiv — NLP / Computation & Language research 2d ago

Parametric Skills

arXiv:2606.30015v1 Announce Type: new Abstract: Since intelligence fundamentally relies on efficient skill acquisition (Chollet, 2019), the ability to leverage skills is critical. For LLMs, skills, manually authored or extracted from task trajectories, are textual recipes…

16
arXiv — NLP / Computation & Language research 2d ago

Efficient Retrieval-Augmented Generation via Token Co-occurrence Graphs

arXiv:2606.30093v1 Announce Type: new Abstract: Retrieval-Augmented Generation (RAG) mitigates hallucinations in Large Language Models (LLMs) by grounding the generation process on external knowledge. However, standard RAG approaches struggle with multi-hop reasoning. While…

10
arXiv — NLP / Computation & Language research 2d ago

Estimating Grammatical Gender Directions in Contextual Embeddings under Controlled and Natural Contexts

arXiv:2606.30152v1 Announce Type: new Abstract: Contextual language models conflate grammatical gender and social semantic bias in gendered languages such as Spanish. Existing gender debiasing approaches only operate on static word embeddings leaving contextual representations…

26
arXiv — NLP / Computation & Language research 2d ago

Forewarned is Forearmed: When Non-Sequential Embedding Turns Into an Anomaly Detector

arXiv:2606.30196v1 Announce Type: new Abstract: This paper offers an in-depth analysis of non-sequential multimodal sentence-level embeddings, with a particular focus on the SONAR model. We demonstrate that certain embedding dimensions are sensitive to perturbations and can…

25
Hugging Face Daily Papers research 2d ago

GUICrafter: Weakly-Supervised GUI Agent Leveraging Massive Unannotated Screenshots

Abstract GUICrafter addresses GUI agent data challenges through a weakly-supervised approach using unannotated screenshots and a two-stage curriculum learning framework for visual grounding and reinforcement learning calibration. Generated by Qwen/Qwen2.5-Coder-32B-Instruct…

10
Vercel — AI dev-tools 2d ago

Expanded Audit Log coverage, now delivered through Vercel Drains

Audit Logs now capture 400+ unique team activity events, giving teams broader coverage for security reviews, compliance workflows, and investigations. With Vercel Drains support, teams can export those events to custom HTTP endpoints or Amazon S3, replacing Custom SIEM Log…

6
r/LocalLLaMA community 2d ago

What's the full local AI "doomsday prepper" kit for cold storage? 16-bit safetensors of LLMs (obv), copies/source codes of Llama.cpp, ComfyUI, vLLM, Kobold, LMStudio, etc, macOS, Linux OSes, Windows 10&11, etc, Rufus (including older ones), various VMs, P-E-W's Heretic/Grimoire,…

For those who want to be as paranoid and maximally doomsday prepped as possible, I am curious what the most thorough "doomsday kit" is of things to store offline copies of "just in case", to still be able to use local AI if things go truly crazy to a super extreme level. So far…

23
r/MachineLearning community 2d ago

I built a demo agricultural planning system with an AI advisor for small-scale farmers in Nicaragua using NASA data [p]

(This was deleted before, but I don't know whether it was Reddit's filters or the moderators. If it was the moderators, I will not post it again after you delete it, sorry.) (The name will probably change soon because I didn't realize "AgroVision" is already a registered trademark lol.)…

15
Hacker News — AI on Front Page community 2d ago

Pollen (CEO Negus-Fancey, CTO Wright) tried to remove article, and Google helped

Article URL: https://blog.pragmaticengineer.com/pollen-tried-to-remove-my-article-about-callum-negus-fancey-and-google-is-assisting-to-it/ Comments URL: https://news.ycombinator.com/item?id=48716902 Points: 264 · Comments: 32

15
r/MachineLearning community 3d ago

RAGless: Q-Q retrieval with score aggregation for closed-domain FAQ [P]

What it does RAGless is a semantic retrieval system based on Question-to-Question matching. At ingestion, an LLM generates multiple question variants per answer (3–5) and each variant gets its own embedding. At query time, the user question is embedded, Top-K nearest question…

23
arXiv — Machine Learning research 3d ago

Boundary condition fidelity for bottom-hole pressure and CO2 plume prediction in geological carbon storage

arXiv:2606.27515v1 Announce Type: new Abstract: Accurate prediction of bottom-hole pressure (BHP) and CO2 plume migration is essential for safe geological carbon storage, yet practical simulations often rely on truncated domains where artificial boundaries distort pressure…

32
arXiv — Machine Learning research 3d ago

TeRoR: Decoupled Temporal Rotation with Relational Circular Region for Temporal Knowledge Graph Embedding

arXiv:2606.27651v1 Announce Type: new Abstract: In recent years, with the emergence of Temporal Knowledge Graphs (TKGs), research on learning entity and relation representations in TKGs has attracted increasing attention, giving rise to a large number of TKG embedding methods.…

35
arXiv — Machine Learning research 3d ago

Are Time-Series Foundation Models Ready for E-Nose Data? An Empirical Assessment of Their Embeddings

arXiv:2606.27672v1 Announce Type: new Abstract: Inspired by advances in natural language processing and computer vision, "time-series foundation models" (TSFMs) have recently been introduced with the promise of strong generalization across diverse time-series tasks, including…

5
arXiv — Machine Learning research 3d ago

Aurora: A Leverage-Aware Spectral Optimizer

arXiv:2606.27715v1 Announce Type: new Abstract: We show that for tall matrix parameters, like projection matrices in the MLP layers, the Muon update can have row norms that are arbitrarily non-uniform. This can lead to a self-reinforcing feedback loop whereby neurons receive…

13
arXiv — Machine Learning research 3d ago

Dangerous Liaisons of Convex Learning and Non-Affine Aggregation

arXiv:2606.28123v1 Announce Type: new Abstract: Last-iterate convergence and generalization guarantees in first-order convex learning hinge on the monotonicity of the update operator. While linear averaging preserves the monotonicity of gradient updates, this property is often…

17
arXiv — Machine Learning research 3d ago

On the Inseparability of Instructions and Data in Shared-Embedding Sequence Models

arXiv:2606.27567v1 Announce Type: cross Abstract: Prompt injection is the top security risk for LLM-integrated applications, yet every defense proposed so far has been broken. We prove this is not a coincidence: in shared-embedding architectures that lack enforced control-data…

20
arXiv — NLP / Computation & Language research 3d ago

Causal Connections: Leveraging Multilingual Fine-Tuning for Financial QA@FinCausal 2026

arXiv:2606.27446v1 Announce Type: new Abstract: This paper describes team HSA_CORAL's submission to the FinCausal 2026 shared task on extracting cause-effect relations from financial narratives via extractive question answering in English and Spanish. We compare three modeling…

4
arXiv — NLP / Computation & Language research 3d ago

Mitigating Position Bias in Transformers via Layer-Specific Positional Embedding Scaling

arXiv:2606.27705v1 Announce Type: new Abstract: Large Language Models (LLMs) still struggle with the "lost-in-the-middle" problem, where critical information located in the middle of long-context inputs is often underrepresented or lost. While existing methods attempt to…

4
arXiv — NLP / Computation & Language research 3d ago

SHIFT: Gate-Modulated Activation Steering for Knowledge Conflict Mitigation in Retrieval-Augmented Generation

arXiv:2606.27786v1 Announce Type: new Abstract: Retrieval-augmented generation (RAG) enhances LLMs by incorporating external knowledge to support response generation. However, conflicts between retrieved context and parametric knowledge have emerged as a critical challenge in…

16
arXiv — NLP / Computation & Language research 3d ago

The Signal-Coverage Matrix: Stratifying Type and Semantic Errors in Statement Autoformalization

arXiv:2606.28013v1 Announce Type: new Abstract: Headline type-correctness (TC%) of LLM autoformalization has climbed from ~53% to ~76% in two years, yet this scalar conceals which errors each method resolves. We propose a signal-coverage matrix that crosses the Lean…

23
arXiv — NLP / Computation & Language research 3d ago

MultiHashFormer: Hash-based Generative Language Models

arXiv:2606.28057v1 Announce Type: new Abstract: Language models (LMs) represent tokens using embedding matrices that scale linearly with the vocabulary size. To constrain the parameter footprint, prior work proposes hashing many tokens into a single vector within encoder-only…

15
arXiv — NLP / Computation & Language research 3d ago

HPRO: Hierarchical Progressive Reward Optimization via Preference Extraction for Emotional Text-to-Speech

arXiv:2606.28249v1 Announce Type: cross Abstract: Recently, Large Language Model (LLM)-based Text-to-Speech (TTS) models have achieved remarkable naturalness. However, the standard Supervised Fine-Tuning paradigm often converges to statistically averaged prosody, limiting…

20
r/LocalLLaMA community 3d ago

A lot of good M5 Max options available at Apple Refurbished

Just a heads-up. After Apple's price hike announcement, they added a bunch of top-of-the-line 14" M5 Pro/Max options to their refurbished website. If you got discouraged by the price hike, check out their refurbished store.

13
r/MachineLearning community 3d ago

I shrank a transformer until every number fitted on the screen and made the weights editable [R]

I've been teaching myself how LLMs actually work, not at the API level, but down to the matrix multiplications. To force myself to really understand the forward pass, I first built a complete transformer by hand in a spreadsheet from embeddings through to the loss. Then I turned…

31
r/MachineLearning community 4d ago

Benchmarking Self-Hosted Gemma 2 9B vs. Frontier APIs: The FP8 Quantization Prefill Tax and VRAM Realities on an NVIDIA L4 [P]

When evaluating migrating production LLM workloads off commercial cloud APIs, the conversation usually gets oversimplified into a trade-off between quality and infrastructure cost. To look past clean, isolated averages, I built a repeatable evaluation matrix using a real-world…

29
Hugging Face Daily Papers research 4d ago

Fast LeWorldModel

Abstract Fast-LeWM accelerates visual planning by replacing autoregressive rollout with parallel action-prefix prediction, reducing computational costs and latency accumulation during long-horizon predictions. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Joint-Embedding…

20
TechCrunch — AI news-outlet 4d ago

Asian AI startups launch Mythos-like models as Anthropic’s export ban drags on

New models are launching in Asia that promise Mythos-like capabilities without fear of an export ban. U.S. AI labs may never recover this enormous market.

29
r/LocalLLaMA community 5d ago

What's one local AI workflow you wish you'd discovered sooner?

There are a lot of posts about the models and benchmarks, but I am more interested in the workflows that people use. What is one workflow that really saved you time or made your local LLM more useful? It could be anything—RAG, MCP, coding agents, organizing prompt, document…

23
Hugging Face Daily Papers research 6d ago

Hallucination in World Models is Predictable and Preventable

Abstract World models exhibit hallucinations in low-data regions of state-action space, which can be detected and mitigated using data-centric signals and coverage-aware sampling techniques. Generated by Qwen/Qwen2.5-Coder-32B-Instruct Modern generative world models render…

25
arXiv — Machine Learning research 6d ago

Fast LeWorldModel

arXiv:2606.26217v1 Announce Type: new Abstract: Joint-Embedding Predictive Architectures (JEPAs), including recent LeWorldModel (LeWM), have become a promising foundation for reconstruction-free visual world models. For visual planning, however, LeWM evaluates candidate action…

32
arXiv — Machine Learning research 6d ago

Embedding Foundation Model Predictions in Discrete-Choice Models with Structural Guarantees

arXiv:2606.26432v1 Announce Type: new Abstract: Tabular foundation models achieve strong accuracy on choice prediction tasks, but their predictions often violate the economic logic those tasks require: raising a price can increase predicted demand, implied willingness-to-pay…

36
arXiv — Machine Learning research 6d ago

PersistentKV: Page-Aware Decode Scheduling for Long-Context LLM Serving on Commodity GPUs

arXiv:2606.26666v1 Announce Type: new Abstract: Autoregressive large language model (LLM) serving is increasingly limited by key-value (KV) cache movement rather than dense matrix multiplication. Modern paged-attention systems reduce KV-cache fragmentation and mature kernels…

20
arXiv — Machine Learning research 6d ago

A Generalization Theory for JEPA-Based World Models

arXiv:2606.27014v1 Announce Type: new Abstract: Joint Embedding Predictive Architectures (JEPAs) have recently emerged as a promising paradigm for world modeling by learning predictive dynamics in a latent space rather than generating future observations at the input level.…

5
arXiv — Machine Learning research 6d ago

Cross-Head Attention Uplift Network with Inverse Propensity Score under Unobserved Confounding

arXiv:2606.27114v1 Announce Type: new Abstract: Uplift modeling, crucial for estimating individual treatment effects (ITE), faces dual challenges: flexibly leveraging inter-group similarity to enhance discriminative power and debiasing under unobserved confounding scenarios. In…

19
arXiv — Machine Learning research 6d ago

BetXplain: An Explanation-Annotated Dataset for Detecting Manipulative Betting Advertisements on Social Media

arXiv:2606.27274v1 Announce Type: new Abstract: The promotion of betting applications on social media platforms has increased significantly in recent years. Many of these advertisements use persuasive techniques that may mislead users, encourage risky behavior, and potentially…

37
arXiv — NLP / Computation & Language research 6d ago

Charting the Growth of Social-Physical HRI (spHRI): A Systematic Review Pipeline Augmented by Small Language Models

arXiv:2606.26382v1 Announce Type: new Abstract: Social-physical human-robot interaction (spHRI) has grown rapidly across robotics, human-computer interaction, human-robot interaction, and haptics. Yet, fragmented terminology and inconsistent methodologies make systematic…

35
arXiv — NLP / Computation & Language research 6d ago

ProvenAI: Provenance-Native Traces of Evidence in Generated Answers

arXiv:2606.26449v1 Announce Type: new Abstract: Retrieval-augmented systems routinely present citations alongside generated answers, yet a citation does not confirm that the corresponding source meaningfully shaped the output. This paper introduces ProvenAI, a framework that…

17
arXiv — NLP / Computation & Language research 6d ago

Speaking Numbers to LLMs: Multi-Wavelet Number Embeddings for Time Series Forecasting

arXiv:2606.26487v1 Announce Type: new Abstract: Large language models (LLMs) are attractive for context-aware time series forecasting because they can integrate heterogeneous textual signals, yet their discrete, language-oriented tokenization and embedding interfaces are…

21
arXiv — NLP / Computation & Language research 6d ago

Comparing BERT Sentence-Pair Classification and Few-Shot LLM Prompting for Detecting Threat and Solution Framing in German Climate News

arXiv:2606.26489v1 Announce Type: new Abstract: News media play a central role in shaping public perceptions of climate change, and whether coverage emphasizes threats or solutions has measurable effects on audience engagement and policy support. Automated detection of these…

23
