r/MachineLearning


r/MachineLearning community 27d ago

Is it allowed to use OpenAI API outputs to create a silver code dataset or benchmark for a specific Python library? [d]

Hello everyone, Is it allowed to use OpenAI API outputs to create a silver code dataset or benchmark for a specific Python library? I am working on a project idea related to library-specific code generation. The concrete case is a specific Python library used in a…

18
r/MachineLearning community 27d ago

Scrap the LLMs. Scoring 4.76% on the brand new ARC-3 using pure code, a 2012 AMD CPU, and zero AI tokens.[P]

Hey everyone, The ARC Prize 2026 just launched the interactive ARC-AGI-3 track, and the collective AI world is panic-renting massive H100 clusters trying to get multi-billion parameter LLMs to navigate these dynamic environments. Predictably, out-of-the-box LLMs are faceplanting…

31
r/MachineLearning community 27d ago

[R] Measuring the Symmetry--Data Exchange Rate

The prediction that equivariance reduces sample complexity by a factor of |G| appears in roughly every paper on geometric deep learning and is measured as an actual scaling law in roughly none of them. This paper does the measurement. The methodology is the interesting part.…

9
r/MachineLearning community 28d ago

How do ML researchers actually use AI tools to improve their writing? [D]

As an ML researcher, how do you use AI tools in your daily work? Do you mostly use them to clean up grammar and wording, or also to rewrite, structure, or draft technical text? submitted by /u/Hope999991

5
r/MachineLearning community 28d ago

We built a source-available LLM reliability library (free for research / personal / internal eval) that can cut inference cost by half at matched quality, and you adopt it by changing one import [P] [R]

TL;DR: Reliability techniques (methods that boost an LLM's correctness by spending extra inference, e.g., retries with feedback, ensembling, generator/critic refinement, verification passes, difficulty-aware routing) are scattered across the literature, each in its own…

10
r/MachineLearning community 28d ago

[P]Stop using print() to debug your agents. Here's a 60-second alternative.[P]

Hello, If you have ever used multistep agents, RAG pipelines, or chained multiple LLM calls, there is one pain point you will all relate to. When an agent gets stuck in an infinite loop, suddenly hallucinates on the third step, or is quietly burning through OpenAI API credits...…

20
r/MachineLearning community 28d ago

Faithful uncertainty in LLM agents: calibration vs utility tradeoff in practice[D]

The Google paper on metacognition for hallucination reduction makes a distinction that is underappreciated in benchmarks. Calibration is not about being right more often. It is about matching confidence to correctness. A perfectly calibrated model can still be wrong twenty five…

26
r/MachineLearning community 28d ago

KVarN: Variance-Normalized KV-Cache Quantization [R]

Excited to share some of my own work here :) KVarN is our new KV-Cache quantization method. Very briefly, we combine Hadamard rotations with variance-normalization on both axes of the K and V matrices, then round to nearest. Simple, but works very well, especially for…

21
r/MachineLearning community 28d ago

On-policy distillation: one of the hottest terms on PapersWithCode [R]

Hi, Niels here from the open-source team at Hugging Face. At paperswithcode.co I am trying to make it easier for people to learn about the newest techniques used across AI papers. One of the hottest terms in AI research that I've recently added is On-policy distillation, also…

27
r/MachineLearning community 28d ago

ICML financial aid [D]

Hello, I am curious about the selection criteria for ICML financial aid. If anyone has been granted financial aid, would you mind sharing your profile? Somehow being a black woman (2 underrepresented groups) with one paper accepted at the main conference and two papers accepted…

7
r/MachineLearning community 28d ago

How Do You Handle Ablation Studies When the Original Model Is Already Trained?[R]

I'm running into an issue with an ablation study for a paper I'm preparing. I trained a model. The model achieved my best result, and I saved the trained checkpoint (.pth file). Now my supervisor wants me to perform an ablation study by removing components and measuring how it impacts…

29
r/MachineLearning community 28d ago

Embedding space [D]

Hello everyone, I’m relatively new to this area of machine learning and currently experimenting with Variational Autoencoders (VAEs) to build an embedding space for an image dataset. The images have different spatial dimensions, and I cannot easily standardize them to a fixed size.…

11
r/MachineLearning community 28d ago

Repo for implementations of various Transformer Attn mechanisms [P]

Initially, I developed this so I can easily switch between different Attention mechanisms for my Small Language Model (SLM) experiments and benchmarking. However, I also realized that these implementations can be applicable in Computer Vision, modernize Vision Encoders, RL, and…

14
r/MachineLearning community 28d ago

Research in Image/Video Gen AI models [D]

I've been going down a rabbit hole with image/video generation/editing models for a few months now, started with playing around with Stable Diffusion and ComfyUI, then got genuinely hooked on understanding why things work, not just that they do. I have an Engineering background…

20
r/MachineLearning community 28d ago

In current ML systems, where is the main bottleneck: dataset quality or model architecture improvements? [D]

A lot of recent progress in ML appears to come from scaling existing architectures rather than introducing fundamentally new ones. At the same time, there’s increasing emphasis on dataset quality, curation, and synthetic data pipelines. In practice, I’m trying to understand how…

13
r/MachineLearning community 28d ago

Best Visual Reasoning Model in 2026 (Including APIs) [D]

For example, suppose I have a one-hour video and I provide it to ChatGPT or another AI model. If I ask complex reasoning questions about the video, which models are best suited for long-horizon video understanding and reasoning? Which models can produce the most reliable answers…

38
r/MachineLearning community 28d ago

I have done a ML Project as a Novice [P]

Hi there! I am going to complete my MSc in Business Analytics and planning to do some real-life projects to attract the recruiters. I am sharing one of such projects here: FIFA World Cup 2026 Prediction: https://amit-world-cup-2026-simulator.streamlit.app/ Project Overview Large…

5
r/MachineLearning community 28d ago

Has anyone heard back from citadel ICML travel grant ? [D]

It’s confusing because they said applicants will be notified on 3rd June but also said you’ll be notified 2-4 weeks after the deadline (29th May). submitted by /u/Smol_pp001

6
r/MachineLearning community 28d ago

First paper acceptance (ICML Workshop), should I attend? [D]

I just finished my first year of undergrad, and I got my first first-author paper accepted to an ICML workshop! Super stoked, especially since I was lowk a crashout in high school. I wanted to know if it is worth it for me to go. It's quite expensive, and I will be the only one…

30
r/MachineLearning community 28d ago

NeurIPS Reciprocal Reviewers be careful in reviewing with LLMs [D]

As the title says. I am not a reciprocal reviewer but I just noticed a clever prompt injection like they did in ICML for our submission. submitted by /u/Massive-Bobcat-5363

18
r/MachineLearning community 28d ago

How are production ML systems typically handling distribution shift over time? [D]

In deployed ML systems, data distribution drift seems unavoidable over longer time horizons. I’m trying to understand what approaches are commonly used in practice: continuous retraining pipelines (fixed intervals vs trigger-based); online monitoring for feature or prediction…

25
r/MachineLearning community 29d ago

NeurIPS used uncalibrated AI detector for desk rejections [D]

I recently had a submission desk-rejected from the NeurIPS 2026 Position Paper Track for an alleged AI-policy violation. After corresponding with the track leadership and reading their public blog post, I think the broader methodological issue is worth discussing here. The track…

13
r/MachineLearning community 29d ago

Analysis of AlphaZero training data [D]

I am trying to train an AlphaZero model for Othello on a 6x6-board. Having been warned that too little exploration during data generation can lead to models being overconfident and trapped in some tight region of the search tree, I started with the value c_puct = 4.0, and then…

35
r/MachineLearning community 29d ago

A semantic tokenization scheme where token geometry reflects semantic relationships [R]

I have been thinking about an alternative tokenization and representation scheme for language models and would be interested in hearing whether similar ideas have been explored before, as well as potential advantages or flaws. The core observation is that modern tokenizers (BPE,…

30
r/MachineLearning community 29d ago

Encodec.cpp, a portable C++ implementation of Meta's EnCodec using Eigen [P]

I built a C++ implementation of Meta’s EnCodec using Eigen. GitHub: https://github.com/pfeatherstone/encodec.cpp Motivation: a lightweight implementation of EnCodec with no runtime dependencies, in C++; no ML runtime; easy integration in a CMake project; maximum performance…

7
r/MachineLearning community 29d ago

TorchDAE: Implicit DAE Solvers with Index Reduction and Adjoint Sensitivity [P]

Hello everyone, I've been working on a PyTorch library for solving Differential Algebraic Equations (DAEs) that supports vectorized execution and GPU acceleration. The library implements several algorithms that are not currently available in the Python ecosystem, including…

27
r/MachineLearning community 29d ago

MiniMax dropped a new attention architecture. [N]

It contains something interesting about context windows. They’re natively scaling to 1M tokens with MiniMax Sparse Attention (MSA), bypassing standard quadratic complexity by completely restructuring the memory access patterns at the operator level. Instead of relying on…

26
r/MachineLearning community 1mo ago

Thoughts on Logical Intelligence’s Kona [D]

Sometime late last year a company called Logical Intelligence developed an EBM called Kona. What do people make of the company’s claims that they have a close-to-functioning EBM? And if true, what impact would this have on existing AI? submitted by /u/Treey1234…

24
r/MachineLearning community 1mo ago

MTPAMI Survey Paper Length for submission time? [D]

My paper is around 33 pages, but the TPAMI guideline said it should be 20 pages. Does anyone know which is correct? The title has a typo: it’s TPAMI. submitted by /u/Alternative_Art2984

30
r/MachineLearning community 1mo ago

Is the hallucination problem solved for document search? [D]

I was wondering if anyone knows of state-of-the-art research on the hallucination problem for document search with LLMs. I know, for example, that in math you can use a verifier to check a proof. What about document search with LLMs, when I feed them documents? submitted by…

23
r/MachineLearning community 1mo ago

Backpropagation destroys V1 brain alignment in one epoch, tracking RSA alignment to fMRI across training for BP, FA, predictive coding, and STDP [R]

Third in a series of papers tracking learning rules vs. human fMRI (THINGS dataset, V1–IT, N=3 subjects). Previous finding: untrained CNNs match backprop at V1. This paper asks: when does training break that, and does the learning rule matter? Setup: RSA alignment measured at 8…

30
r/MachineLearning community 1mo ago

LLM agents patch security bugs, pass all tests, but still leave the vulnerability open [R]

I built CVE-Bench: 20 real-world CVEs across 18 Python projects (Pillow, GitPython, yt-dlp, urllib3, others), 5 frontier models, 3 prompt conditions, 300 runs total. Each agent runs in a sandboxed container and is scored against a hidden test_security.py derived from the…

4
r/MachineLearning community 1mo ago

Browse CVPR 2026 papers on PapersWithCode [P]

Hi, Niels here from the open-source team at Hugging Face. It's been 2 weeks since I launched paperswithcode.co, a revival of the website we all loved. It allows…

11
r/MachineLearning community 1mo ago

I scraped over 2 million job postings across 100,000+ company career sites into a unified, daily-updated dataset. [P]

Over the past few months, I've been working on a high-scale scraping pipeline to aggregate listings directly from company job boards and applicant tracking systems. Mapping over 100,000 distinct companies to their career pages turned out to be a massive engineering headache, but…

32
r/MachineLearning community 1mo ago

[D] Self-Promotion Thread

Please post your personal projects, startups, product placements, collaboration needs, blogs etc. Please mention the payment and pricing requirements for products and services. Please do not post link shorteners, link aggregator websites, or auto-subscribe links. -- Any abuse…

22
r/MachineLearning community 1mo ago

MeshFlow: production-safe multi-agent orchestration — SHA-256 audit chain, HIPAA/SOX/GDPR built in, 70-85% token cost reduction [Open Source][D]

79% of enterprises have adopted AI agents. Only 11% run them in production. We've spent the past year building agent systems for banks, clinical operations teams, and engineering orgs. The problem isn't that agents don't work — they work fine. The problem is that every framework…

12
r/MachineLearning community 1mo ago

MeshFlow: An open-source orchestrator for governed, cost-optimized multi-agent workflows [D]

Hey ML community, We’ve just open-sourced MeshFlow, a code-first, framework-agnostic runtime designed for governing and optimizing multi-agent systems in production. Most agent frameworks focus on rapid prototyping, but ML and platform engineering teams usually run into…

23
r/MachineLearning community 1mo ago

ICML Conference Ticket (looking to purchase) [D]

Hi everyone, I missed the ICML conference tickets because I was waiting for some travel funding confirmation and now they are sold out. Do you know any other ways I could still purchase one? There seems to be no waiting list… or if you know anyone who needs to cancel theirs,…

37
r/MachineLearning community 1mo ago

Full duplex vs half duplex - the spectrum of AI voice models [D]

It seems that there are two ways to build voice AI: Half-duplex: strict turn-taking. You speak, the other side waits until you’re done, one direction of speech at a time. ← This is how almost every voice assistant works today. Full-duplex: two channels, both sides can talk at…

32
r/MachineLearning community 1mo ago

Feedback on my EU AI Act Risk Tier Assessor [P]

Hey everyone, hope this is ok to post here. I built a free EU AI Act risk assessment tool and would love some feedback from people who actually know this space. You fill out a 10-question form describing your AI system, it classifies your EU AI Act risk tier, and emails you a…

35
r/MachineLearning community 1mo ago

Why our #1 LightGBM feature by importance made predictions worse [D]

We recently hit a classic gradient boosting trap with our pricing engine (Flyback), and I wanted to share the ablation data. We run LightGBM quantile regression to forecast secondary market watch prices. We engineered a variant-conditioned Bayesian target encoder to isolate…

7
r/MachineLearning community 1mo ago

ICML Financial Aid [D]

Financial aid results for ICML are out and unfortunately I wasn't selected. I was wondering, does this mean I wasn't selected for Volunteering as well? Or should I expect a separate email? submitted by /u/RussB3ar

34
r/MachineLearning community 1mo ago

Finetuning a Reasoning LLM with Supervised or Reinforcement Learning? [D]

Hello, I have a task to fine-tune small LLMs on annotated conversational data. The dataset contains not only the final answers, but also reasoning traces and tool-calling decisions (i.e., when the model should think and when it should call a tool). I am wondering what the best…

34
r/MachineLearning community 1mo ago

Real-time multilingual ASR using rolling buffers and monolingual models [P]

I built a routing-based approach to lightweight real-time multilingual ASR as part of my research at Gladia. The core problem was how multilingual models that accurately handle mid-conversation language switches are often too big for most local hardware and have poor accuracy.…

36
r/MachineLearning community 1mo ago

[D] Simple Questions Thread

Please post your questions here instead of creating a new thread. Encourage others who create new posts for questions to post here instead! Thread will stay alive until next one so keep posting after the date in the title. Thanks to everyone for answering questions in the…

32
r/MachineLearning community 1mo ago

How much of MLE-Bench's gains are the algorithm vs. better models + more search? [R]

MLE-Bench scores have jumped from 30% to 80% over the last two years. But how much of that is real algorithmic progress vs. better base models + problem definition shifts + overfitting? Turns out: not much. Once you control for the same step budget and models, and then test on a…

5
r/MachineLearning community 1mo ago

5060 Ti 16GB or Cloud: Which makes more sense for DL, RL, and LLM studies/research? [D]

Hi everyone, If you have purchased one or more GPUs for ML/DL studies and research: How has your experience been, and was it worth it? What do you use it for and how is the ROI? I have a MacBook Pro with M4 from some years ago. While MPS is useful on many occasions, it's no…

29
r/MachineLearning community 1mo ago

Do you see GNN's playing a meaningful role in astrophysics research? [D]

A bit of background about myself: I have been accepted to RWTH Aachen's Computer Science program starting this fall, and one of the things that I am genuinely excited about is exploring the intersection of astrophysics and machine learning. The tricky part is that RWTH's CS…

13
r/MachineLearning community 1mo ago

[P] Free AI Agent Security Assessment [P]

Hey everyone, We’re building Antitech, a security layer for AI agents and LLM-powered workflows. We’re opening a small number of free early-access assessments for teams/builders working on AI agents. If you give us access to an endpoint of a Dockerized / sandboxed environment…

8
r/MachineLearning community 1mo ago

Have you ever been pressured to "torture the data" to eke out a positive result, in industry? [D]

Without revealing too much information, what were the circumstances? submitted by /u/XTXinverseXTY

31
