r/MachineLearning


r/MachineLearning community 1mo ago

Spice: We built an open-sourced decision layer that sits above your AI agents (controls agent actions before execution) [P]

Hi guys, been exploring here for a while, wanted to share something we've been working on. It's called Spice, an open-source decision layer above agents. We have tons of great execution agents now: Claude Code, Codex, hermes, etc. They're good at doing stuff. But they're…

6
r/MachineLearning community 1mo ago

I built a Mamba1 variant I call SM1 with d_state=1 that runs on Blackwell in pure PyTorch [P]

On Windows, mamba-ssm is not easily available and doesn't compile on sm_120. SM1 (Scalar Mamba1) replaces the entire selective scan with two native PyTorch ops:

    L = torch.cumprod(dA, dim=1)
    h = L * (h0.unsqueeze(1) + torch.cumsum(dBx / L.clamp(min=1e-6), dim=1))
    y = h * C

This is…
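The cumprod/cumsum expression quoted in this post is algebraically the closed form of the scalar recurrence h_t = dA_t * h_{t-1} + dBx_t with y_t = h_t * C_t. A minimal stdlib-only sketch for one scalar channel, checking the two against each other; the function names and the sample values are made up for illustration, and this is not the author's SM1 code:

```python
from itertools import accumulate
import operator


def scan_sequential(dA, dBx, C, h0=0.0):
    """Reference: step the recurrence h_t = dA_t * h_{t-1} + dBx_t one token at a time."""
    h, ys = h0, []
    for a, b, c in zip(dA, dBx, C):
        h = a * h + b
        ys.append(h * c)
    return ys


def scan_closed_form(dA, dBx, C, h0=0.0, eps=1e-6):
    """Same result via a cumulative product and a cumulative sum."""
    L = list(accumulate(dA, operator.mul))               # cumprod of the decay terms
    ratios = [b / max(l, eps) for b, l in zip(dBx, L)]   # mirrors L.clamp(min=eps)
    S = list(accumulate(ratios))                         # cumsum
    return [l * (h0 + s) * c for l, s, c in zip(L, S, C)]


# Hypothetical inputs: decays in (0, 1), arbitrary inputs and readout weights.
dA = [0.9, 0.8, 0.95]
dBx = [0.1, 0.2, -0.3]
C = [1.0, 0.5, 2.0]
seq = scan_sequential(dA, dBx, C, h0=0.25)
closed = scan_closed_form(dA, dBx, C, h0=0.25)
```

The two agree only while the running product stays above the clamp; once it underflows past `eps`, the closed form stops matching the recurrence, which is presumably why the clamp is there.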

21
r/MachineLearning community 1mo ago

Tested chunking + embeddings data from 3 production websites. [P]

Tiered + page-role-aware RAG retrieval results across 3 corpora with very different content density:

Workspace  Sources  Chunks  HIGH  MEDIUM  LOW   REJECTED
Intercom   188      941     96    200     541   104
HubSpot    251      1705    40    508     1153  4
KPMG       53       209     3     14      127   65

(HIGH = avg operational score 0.84,…

19
r/MachineLearning community 1mo ago

LLMs are just giant probability machines pretending to think [P]

It’s fascinating that simple mathematics between tokens can eventually become a machine that writes essays, code, poetry, and even reasoning. We usually think probability means uncertainty. But LLMs show something strange: If probability + context + mathematical matching are…

36
r/MachineLearning community 1mo ago

Anthropic posted a profit while xAI burned $4.2B. The AI profitability numbers finally leaked. [D]

This week basically forced everyone to stop guessing about AI margins. Three major financial reality checks hit at once: OpenAI confidentially filing their S-1, xAI’s Q1 numbers leaking via SpaceX, and Anthropic somehow posting an actual operating profit. If you are building an…

4
r/MachineLearning community 1mo ago

LQS v3.1 — an open methodology for rating AI training data (multi-oracle consensus + signed certificates) [P]

Solo author here. I spent the last six months building (and then sunsetting) a marketplace for AI training data. The marketplace failed for an interesting reason: the actual bottleneck isn't supply. There's tons of data. The bottleneck is that buyers can't independently evaluate…

14
r/MachineLearning community 1mo ago

Anonymous Data Upload for Submission [D]

How do you upload data anonymously for a submission (ACL/EMNLP)? I have several models I need to upload for replication and was thinking HuggingFace, but HF offers download tracking on a paid plan. Does this violate the policy since there is the potential of tracking the…

18
r/MachineLearning community 1mo ago

Looking for arXiv endorsement + sharing a preprint on homeostatic cognitive architecture for AI companions [R]

Hey r/ML — I just posted a preprint on SSRN for PHI // DRIFT, a cognitive architecture that gives an AI companion persistent internal state, salience-weighted memory retrieval, and a falsifiable continuity metric (PEDI). Ablation testing confirmed the DMU memory system injects…

13
r/MachineLearning community 1mo ago

Could ML be used to automate C-suite organizational duties? [D]

We often see worry from workers that ML techniques will either fully replace them, or jostle them violently economically such that their earnings and well-being are impacted. Concurrently, many tech companies resist unionization/"guild" efforts to protect the careers of…

11
r/MachineLearning community 1mo ago

Custom image encoder [P]

Hello, I would like to know whether building my own image encoder would be a good idea instead of using models like CLIP, SigLIP/SigLIP2, or DINO. My use case is video frame classification. My pipeline is the following: the client sends me a video stream, sampled at 1 frame per…

5
r/MachineLearning community 1mo ago

COLM 2026 Reviews Discussion [D]

Didn't see one, so I wanted to make one myself. Reviews are actually already out; curious what everyone thinks about the quality of the reviews. I've heard it's a mixed bag, with apparently a concerning number of AI-generated reviews for some people.

33
r/MachineLearning community 1mo ago

NuExtract3 released: open-weight 4B VLM for Markdown, OCR and structured extraction (self-hostable) [P]

Disclaimer: I work for Numind, the company behind this open-weight model. We just released a 4B model based on Qwen3.5-4B, under the Apache-2.0 license. The goal is to make information extraction from complex documents more practical with an open model: PDFs, screenshots, forms,…

12
r/MachineLearning community 1mo ago

One thing that's been bothering me lately: benchmark performance often tells me almost nothing about whether a workflow will survive production usage. [D]

I've seen systems score well internally and then immediately fail under: ambiguous user intent, messy real-world context, contradictory instructions, long-running sessions. Feels like evaluation still heavily rewards clean-task optimization instead of behavioral robustness. What are…

26
r/MachineLearning community 1mo ago

Live Human Detector on Outbound Phone Calls [R]

Goal: To save humans from wasting time sitting in call centre queues waiting to be answered. To have a tool listen in on the audio stream of a live call, post IVR navigation, to determine whether the call has transitioned out of the queue and to a live person. Requirements: The tool must…

20
r/MachineLearning community 1mo ago

Novel Problems in VLA [R]

I'm currently doing a research internship, and my supervisor is constantly pushing me to come up with a novel idea. I've read about 15-20 papers on VLA, and I think most directions are saturated. I thought about an equivariant VLA based on equivariant CNN which was published in…

21
r/MachineLearning community 1mo ago

Can liveness detection models generalise to synthetic media generation techniques they were never trained on? [D]

Most liveness detection systems in production today were built around a threat model where the attacker is submitting a static image or a basic replay video. The generation quality of current synthetic media is categorically different from what those training datasets captured.…

32
r/MachineLearning community 1mo ago

using .npy dataset with 3D models [R]

Hello guys, I am trying to work on the ADNI dataset to get 90% accuracy, but it keeps getting stuck at 55%. Any tips to improve results? submitted by /u/LahmeriMohamed

4
r/MachineLearning community 1mo ago

Lisbon Machine Learning School (LxMLS 2026) [D]

Hi, did anyone apply to it, or attend it previously? How was the experience? I got the acceptance but no scholarship; is it worth going self-sponsored? submitted by /u/Icy-Solid-4159

21
r/MachineLearning community 1mo ago

I created an LLM post-training method called RPS. Preliminary results show that it improved Qwen3-8b's program synthesis reliability. [R]

RPS is inspired by neuroscience. As humans, we learn basic skills as kids with high neuro-plasticity. We then learn advanced skills as teens and adults with low neuro-plasticity. RPS trains a model in 2 stages. In stage 1, the model is trained on easy data with high learning…

26
r/MachineLearning community 1mo ago

Does this idea sound fun? [R]

It's about inference-time learning by inserting some experts specialized for updating sibling expert weights in MoE. All the components needed were already there, but no one tried it inside MoE, so I did a small PoC. It kinda worked. I'd love to hear what you think.…

33
r/MachineLearning community 1mo ago

Do VLMs in production still use fixed-patch ViTs for their vision capabilities? [D]

The research community has provided (already for some time) seemingly more efficient and effective tokenizations for vision. Do we have any hint on whether non-fixed-patches tokenization is being applied on the big player models? I imagine not, and I'm trying to think why: -…

7
r/MachineLearning community 1mo ago

Looking for real-world comparisons between WALL OSS, pi0.6, and OpenVLA [D]

I am choosing a baseline for a real manipulation stack and trying not to lose a month on setup that someone here has already done. Shortlist is OpenVLA, pi0.6, and WALL OSS from X Square Robot. OpenVLA is still the easiest reference point with lots of reproductions. pi0.6 looks…

21
r/MachineLearning community 1mo ago

Columbia Machine Learning Summer School (MLSS) 2026 [D]

I got into this CFE MLSS 2026 and would like to connect with people who also got into it or have been in previous cohorts! I am organizing a group chat for people who got into the program :DD https://cfe.columbia.edu/content/mlss   submitted by   /u/elucidativemind…

24
r/MachineLearning community 1mo ago

High E2E latency on fine-tuned Gemma 4 26B despite low TTFT [R]

Recently fine-tuned a Gemma 4 26B model, and I’m seeing surprisingly high end-to-end latency despite the effective inference footprint being much smaller (~4B-ish behavior during serving). Current setup:
Model: Gemma 4 26B (fine-tuned)
Engine: vLLM
Quantization: FP8
Hardware:…

27
r/MachineLearning community 1mo ago

Masked Diffusion Language Models are Strong and Steerable Text-Based World Models for Agentic RL [R]

Autoregressive LLM world models factorize next-state generation left-to-right, preventing them from conditioning on globally interdependent anchors (tool schemas, trailing status fields, expected outcomes) and yielding prefix-consistent but globally incoherent rollouts. MDLMs'…

28
r/MachineLearning community 1mo ago

l9gpu - open-source GPU observability with workload-level attribution [P]

GPU monitoring tools like DCGM give you hardware-level metrics but no workload context. When a node is saturated, you can't tell which experiment, team, or job is responsible without digging through logs. We built l9gpu to close that gap. It's a node-level agent that exports GPU…

25
r/MachineLearning community 1mo ago

OpenAI claims a general-purpose reasoning model found a counterexample to Erdős's unit-distance bound [D]

OpenAI posted a math result today claiming that one of its general-purpose reasoning models found a construction disproving the conjectured n^{1+O(1/log log n)} upper bound in Erdős’s planar unit-distance problem. Announcement:…

31
r/MachineLearning community 1mo ago

LLMs and Emojis [D]

LLMs are trained on human data, so where does the tendency to add emojis come from? For example, when some models generate code explanations or even normal responses, they often add lots of emojis that people don’t really use that way in real life. My current guess (without…

33
r/MachineLearning community 1mo ago

How competitive are PhD admissions currently [D]

Hi, how hard is it currently to get a PhD position in machine learning? What are the requirements to get into a decent mid-tier program (= they publish regularly at respected journals and their work gets read by some people)? How is it in different regions, e.g. US, Europe,…

10
r/MachineLearning community 1mo ago

Should I accept a PhD offer in NeuroAI [D]

Hi everyone. I am a recent CS grad and I have received a PhD offer from a school in the States. However, I am deeply confused about whether I should accept it or not. My hesitation comes from the interdisciplinary nature of the program. It will be jointly supervised by two professors, one…

27
r/MachineLearning community 1mo ago

Splitting data by label for FAISS [P]

If I have a labeled dataset, is it possible to split my data by label, where each chunk holds the sentences of one label, and then use this to label more sentences? And is this even a good idea for data labeling where I search for this certain sentence and see what the…

25
r/MachineLearning community 1mo ago

Under 2% quality gap but 10x cost difference: tested 5 models on identical tool calling tasks [D]

I've been running a file management agent built on MCP for a few months. It handles module renames, import updates, validation scaffolding, test execution. A typical session is 60 to 120 tool calls. The whole thing was powered by Opus 4.7 because I never thought to question it…

8
r/MachineLearning community 1mo ago

Any tool to get accepted conference papers sorted by citation count? [D]

I.e., given a conference (say with OpenReview data), e.g. “NeurIPS, 2025”, return the accepted papers sorted by number of citations according to a standard paper search engine (e.g. Google Scholar). Seems to be a surprisingly difficult thing to find online.

16
r/MachineLearning community 1mo ago

NOML-NOML: hierarchical TD3 + anchor policy for flight control [P]

I built a custom RL algorithm for continuous flight control and open-sourced it. Sharing here in case the structural ideas are useful for anyone doing continuous control where one action axis dominates. I've been training continuous control on a 6-DoF flight sim…

31
r/MachineLearning community 1mo ago

CANTANTE: Optimizing Agentic Systems via Contrastive Credit Attribution [R]

LLM-based multi-agent systems have demonstrated strong performance across complex real-world tasks, such as software engineering, predictive modeling, and retrieval-augmented generation. Yet, automating their configuration remains a structural challenge. Researchers are often…

17
r/MachineLearning community 1mo ago

Machine Learning on Spherical Manifold [R]

Hi, I'm interested in geometric deep learning (due to Michael M. Bronstein's book and Maurice Weiler's PhD thesis), and in order not to write projects to nowhere, I decided to keep a technical blog. I started with a short note about machine learning on spherical manifolds, but…

34
r/MachineLearning community 1mo ago

Instructions for (ICML) workshop reviews [D]

Hi, I am a reviewer for an ICML workshop; however, there are no guidelines on the structure of the reviews (e.g. what the criteria are, what the grade scale is, etc.). Does anyone know whether ICML workshops have some "convention" regarding reviews? Or ought we to use…

22
r/MachineLearning community 1mo ago

ICML Proceedings-only [D]

For proceedings-only papers, do we need to make a poster and submit it to the portal? Has anyone asked the ICML Program Chair this question? submitted by /u/minhquang251

34
r/MachineLearning community 1mo ago

[ECCV 2026] No modified date next to reviews [D]

On OpenReview, you can see a modified date next to the review. This modified date should be recent (anything 12th May or newer), which means that the reviewer gave a final justification and may have increased their score or kept the same score. In either case, it means they read the…

29
r/MachineLearning community 1mo ago

Comparing data annotation platforms [D]

Scale AI: Highest quality in the industry. But no public pricing and every project requires a sales call. Onboarding takes weeks not days. In June 2025 Meta bought a 49% stake and hired Scale’s CEO as Meta’s Chief AI Officer. Several major customers quietly reduced engagements…

27
r/MachineLearning community 1mo ago

I built a tool that shows you what GPT-2 is "thinking" in real-time as it generates: a 3D graph of concept activations per token [R]

Been going down a mechanistic interpretability rabbit hole for the past few weeks and ended up building this thing called AXON. The idea: every time GPT-2 generates a token, its residual stream gets passed through a Sparse Autoencoder (Joseph Bloom's pretrained SAE). The SAE…

15
r/MachineLearning community 1mo ago

LxMLS 2026 decision [D]

Has anyone applied to LxMLS 2026? Did you get any update? submitted by /u/No_Cardiologist7609

34
r/MachineLearning community 1mo ago

Backprop-free Pong: PC + distributional Hebbian plasticity vs. PPO: 57% vs. 59%, ~1500 lines from scratch [P]

Wanted to see how close a fully bio-plausible agent could get to PPO on Pong. Setup:
Custom Pong environment (pygame, no gym)
PPO baseline: paper-faithful, from scratch
Hebbian agent: PPO policy replaced with Hebbian value estimation engineered features → 61%
BioAgent: Predictive…

26
r/MachineLearning community 1mo ago

What do you think about Tabular Foundation Models [D]

I've seen TabPFN-3's recent results, and there is a lot of buzz about foundation models for tabular data (TabICL, TabPFN). The performance that those models achieve is really amazing. What makes me a little suspicious about them? They can analyze small datasets only, so a few MB…

7
r/MachineLearning community 1mo ago

Graph spectral analysis (Fiedler value + Scheffer CSD indicators) predicts grokking 21k steps before loss function - five reproducible experiments [R]

I've been applying the Fiedler value (second-smallest eigenvalue of the weight graph Laplacian) combined with Scheffer critical slowing down indicators to monitor neural network topology during training. Five experiments, all reproducible on CPU in under 24 hours: Detection:…
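For readers unfamiliar with the term, the Fiedler value is the second-smallest eigenvalue of a graph Laplacian L = D - W. A minimal stdlib-only sketch on a toy graph, assuming a symmetric weight matrix with non-negative entries and a zero diagonal; the function names and the shifted power iteration are illustrative choices, not the post's pipeline, and how the post builds its weight graph from a network is not shown here:

```python
import math


def laplacian(weights):
    """Return L = D - W for a symmetric weight matrix with zero diagonal."""
    n = len(weights)
    return [[(sum(weights[i]) if i == j else 0.0) - weights[i][j]
             for j in range(n)] for i in range(n)]


def fiedler_value(weights, iters=500):
    """Estimate the second-smallest Laplacian eigenvalue by shifted power iteration."""
    n = len(weights)
    lap = laplacian(weights)
    # 2 * max degree bounds the largest eigenvalue; +1 keeps (shift*I - L) positive definite.
    shift = 2.0 * max(lap[i][i] for i in range(n)) + 1.0
    x = [float(i) for i in range(n)]  # deterministic start; a random start is more robust
    rayleigh = 0.0
    for _ in range(iters):
        mean = sum(x) / n
        x = [v - mean for v in x]                     # remove the constant (eigenvalue 0) direction
        norm = math.sqrt(sum(v * v for v in x))
        x = [v / norm for v in x]
        lx = [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
        rayleigh = sum(a * b for a, b in zip(x, lx))  # current eigenvalue estimate
        x = [shift * a - b for a, b in zip(x, lx)]    # power step on shift*I - L
    return rayleigh


# Toy graphs: a 3-node path (Laplacian eigenvalues 0, 1, 3) and a triangle (0, 3, 3).
path3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
```

For anything beyond toy sizes a dense or sparse eigensolver would be the usual choice; the iteration here just keeps the example dependency-free.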

16
r/MachineLearning community 1mo ago

All the fundamental knowledge from Andrew Ng's ML course that I noted down and turned into a GitHub repo [R]


4
r/MachineLearning community 1mo ago

How do loss functions work in PINNs? [D]

I am learning about physics-informed neural networks (PINNs). I am playing with simple 1st- and 2nd-order 1D ODEs, and I am calculating the loss function by adding the initial condition loss and the physics loss (e.g. Total loss = lambda1 (L1) * Physics_loss (PL) + lambda2 (L2) * IC_loss (IL)).…
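The weighted sum described in this post can be sketched on a toy problem. The example below assumes the ODE y' + y = 0 with y(0) = 1 and a two-parameter trial function y(t) = b * exp(a * t) standing in for the network; the ODE, the trial function, the collocation points, and all names are hypothetical, and a real PINN would get y' from automatic differentiation rather than the analytic derivative used here:

```python
import math


def trial(t, a, b):
    """Stand-in for the network output y(t)."""
    return b * math.exp(a * t)


def trial_dt(t, a, b):
    """Analytic derivative of the trial function (autodiff in a real PINN)."""
    return a * b * math.exp(a * t)


def physics_loss(a, b, ts):
    """Mean squared ODE residual y' + y over the collocation points."""
    return sum((trial_dt(t, a, b) + trial(t, a, b)) ** 2 for t in ts) / len(ts)


def ic_loss(a, b, y0=1.0):
    """Squared error on the initial condition y(0) = y0."""
    return (trial(0.0, a, b) - y0) ** 2


def total_loss(a, b, ts, lambda1=1.0, lambda2=1.0):
    """Total loss = lambda1 * Physics_loss + lambda2 * IC_loss."""
    return lambda1 * physics_loss(a, b, ts) + lambda2 * ic_loss(a, b)


ts = [0.0, 0.5, 1.0, 1.5, 2.0]            # collocation points
exact = total_loss(-1.0, 1.0, ts)         # true solution exp(-t): both terms vanish
wrong_ic = total_loss(-1.0, 2.0, ts, lambda1=1.0, lambda2=3.0)  # satisfies the ODE, violates y(0) = 1
```

The second case shows why both terms are needed: 2 * exp(-t) has zero physics residual, so only the initial condition term, scaled by lambda2, penalises it.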

30
r/MachineLearning community 1mo ago

Feeling lost while trying to break into AI/ML: how should I focus my projects? [D]

I’m trying to break into AI/ML Engineer / Applied AI roles, and honestly I’ve been feeling pretty overwhelmed lately. I’ve been building around LLM evaluation, model reliability, cost optimization, and production AI systems. My main projects are: RDAB — a benchmark for…

34
r/MachineLearning community 1mo ago

A Simple Solution to Improve Broken Peer Review System at AI Conferences [R]

An issue with the peer review system is reciprocal reviewing, which incentivizes reviewers to unfairly reject good papers to increase their own papers' chances of acceptance. My proposed solution is that the conference should divide the authors/papers into 2 halves (A and B). If…

16
r/MachineLearning community 1mo ago

How to get rejected by IEEE T-PAMI with 'Excellent' scores? [D]

Hello everyone. I am keeping my identity anonymous today to protect my professional career. I am a junior researcher in Computer Vision, and I am sharing this story because I have hit a devastating deadlock with IEEE T-PAMI and the IEEE Ethics Office. Our Situation:…

13
