r/MachineLearning


r/MachineLearning community 1mo ago

Physics-Informed Neural Networks for damped harmonic oscillator and Burgers' Equation (with extrapolation analysis) [P]

I built a PINN implementation in Python to solve two problems as part of a physics exam project: the damped harmonic oscillator (2nd-order ODE) and the 1D viscous Burgers' equation (nonlinear PDE). Both forward and inverse problems (to estimate unknown equation parameters from…

37
r/MachineLearning community 1mo ago

noisekit - CLI for generating realistic degraded speech datasets for ASR benchmarking [P]

If you've ever tried to pick an STT vendor for a phone-based voice agent or call center product, you've probably hit this wall: you have plenty of real production audio, but it's unlabeled, so you can't compute WER on it. And the annotated public datasets (FLEURS, CommonVoice,…

31
r/MachineLearning community 1mo ago

EMA-Gated Temporal Sequence Compression in Vision Transformers [P]

Vision Transformers waste 90% of their compute recalculating stationary asphalt. NeuroFlow tracks semantic surprise in embedding space, physically eliminating background tokens before the encoder. NeuroFlow is a dynamic routing framework for Vision Transformer video inference.…

34
r/MachineLearning community 1mo ago

Cross-species RSA: same learning rules (BP, PC, STDP, FA) tested against both human fMRI and macaque electrophysiology [P]

Follow-up to my earlier post on learning rules vs. human fMRI. Same five conditions (BP, FA, PC, STDP, untrained), same model weights, now evaluated against macaque V1/V2 (FreemanZiemba2013, single-unit) and macaque V4/IT (MajajHong2015, multi-electrode). Main findings: Early…

23
r/MachineLearning community 1mo ago

Profiling PyTorch training without accidentally stalling the GPU [D]

Profiling PyTorch training has an interesting measurement problem: the more you measure, the more you can change the behavior of the run itself. A simple example is torch.cuda.synchronize() . It gives cleaner timing boundaries, but it also inserts synchronization points into an…

13
r/MachineLearning community 1mo ago

A Tiny Open-Source Self-Driving AI That Runs on a Phone [P]

https://preview.redd.it/ww14mzr2fm3h1.png?width=1890&format=png&auto=webp&s=79873d47ae79c7815ca3e7e91fd43141632174f5 https://www.youtube.com/watch?v=rr_uS4bf0B4&feature=youtu.be trained a 7MB open-source L4 self-driving AI that learns navigation, lane following, and drift…

11
r/MachineLearning community 1mo ago

What to use for Sign Language Recognition [R]

Hi everyone, I'm finishing up the proposal for my undergraduate computer science thesis on sign language recognition, specifically Filipino Sign Language, and I want to ask which architecture would be best for my methodology. Right now I'm considering MediaPipe Holistic +…

32
r/MachineLearning community 1mo ago

GNN Model For Fraud Detection Isn't Performing Well [R]

We're writing a research paper on an explainable fraud detection GNN model, and as a first step we're creating a basic Graph Neural Network for it. We're using the most famous dataset available on this topic, i.e. the IEEE CIS Fraud Detection Dataset, and implemented all necessary…

7
r/MachineLearning community 1mo ago

Is IEEE Workshop on Machine Learning for Signal Processing Reputable? [D]

I randomly came across this conference/workshop: IEEE Workshop on Machine Learning for Signal Processing. Is this a reputable conference and is it worthwhile to submit here vs. a workshop at an A* like ICML, NeurIPS, etc.? (I know these deadlines have passed, I have a paper…

37
r/MachineLearning community 1mo ago

Trouble exploring AI/ML, idk where to begin [D]

So, as the title says. Context: I am a sophomore in computer science. I have prior knowledge in maths (especially the topics relevant to ML) and am good enough with NumPy and pandas. I don't really know where to start. On the internet, every second guy is trying to make me earn 100k/year in 3 months…

24
r/MachineLearning community 1mo ago

Augmented Equivariant Mesh Networks for Anatomical Mesh Segmentation (ICML 2026 Workshops) [R]

Paper: https://arxiv.org/abs/2605.08172 Workshops: AI for Science & Structured Data for Health at ICML 2026 Abstract: Anatomical mesh segmentation requires models that operate directly on irregular surface geometry while remaining robust to arbitrary patient pose and mesh…

20
r/MachineLearning community 1mo ago

Tomesphere, 3M paper pages with TLDRs, peer reviews, code, and a SPECTER2 similarity graph [P]

Built a richer paper page for 3 million arxiv and OpenAlex papers. Free, no signup, no paywall. tomesphere.com Each page has a Gemini generated TLDR, peer reviews scraped from OpenReview with reviewer scores and decisions, GitHub repos, HuggingFace models and datasets,…

31
r/MachineLearning community 1mo ago

Verbosity is not faithfulness: an architectural argument that reasoning models cannot perform faithful inference [D]

Essay argues that reasoning models cannot perform faithful inference because their reasoning trace and final answer come from the same operation. Engages with Lanham/Turpin/Mirzadeh in empirical critique, and with HRM, TRM, GRAM, AlphaProof, and Kona/Aleph as the contrasting…

38
r/MachineLearning community 1mo ago

Have a couple of technical questions for my LLM router [P]

I am a CS undergrad and I think token economics is the next big problem for companies. I am building an LLM router specifically for code and codebases. The routing is not actually done by a heavily fine-tuned LLM (existing solutions already do this). Using a bit of a different…

11
r/MachineLearning community 1mo ago

Added a Chrome Dino-style game to my research tool's pipeline wait screen driven by real SSE events [P]

Slightly unhinged engineering decision but it works. My tool (ScholarScout) has a 2-3 minute pipeline: fetch papers from 8 databases → analyze trends → generate ideas. During that time, the user sees a pixel art owl running through a parallax forest. The fun part: it's not fake…

10
r/MachineLearning community 1mo ago

Dlib or PyTorch to CNN? [D]

I’m currently studying ML, more specifically convolutional neural networks (CNNs) for finding patterns in images. Right now, I’m trying to develop a model that can solve the “Where’s Waldo?” challenge. However, I currently have a question: what would be the best option for…

31
r/MachineLearning community 1mo ago

Built a portable GPU ISA after reading too many architecture manuals [P]

I’ve been reading GPU architecture docs in my free time. NVIDIA PTX, AMD ISA reference guides, Intel Xe, reverse-engineered Apple GPU stuff. Over 5,000 pages across 16 microarchitectures. After a while you notice all four vendors are doing the same 11 things with different…

5
r/MachineLearning community 1mo ago

Where do you go for serious AI research discussion online? [D]

Looking for communities where people actually dig into ML/AI research, not hype, not "look what I built with an LLM API," but discussions about papers, training dynamics, debugging real models, infra problems, that kind of thing. I'm specifically interested in places where you…

15
r/MachineLearning community 1mo ago

Already 11 000 submissions for EMNLP? [D]

Is this normal? I searched it up and last year it was only 8000. Submitted by /u/NightCR_.

24
r/MachineLearning community 1mo ago

Aiki, my local Wikipedia Retrieval-Augmented Generation system [R]

Hey, I built Aiki, a lightweight tool that lets you chat with Wikipedia locally. What it does: - Downloads and chunks Wikipedia articles (you can choose articles by name, with the option of also downloading similar topics) - Uses a custom TF-IDF + cosine…

23
r/MachineLearning community 1mo ago

The famous METR AI time horizons graph contains numerous severe errors [D]

Nathan Witkin, a research writer at NYU Stern’s Tech and Society Lab, writes damningly about the famous METR AI time horizons graph in the Substack publication Transformer: It is impossible to draw meaningful conclusions from METR’s Long Tasks benchmark — in particular once one…

16
r/MachineLearning community 1mo ago

DCGAN inference on a microcontroller: 12.6M parameters, 512KB SRAM, 26-second generation, pure C [P]

Just thought I'd share, I ran a DCGAN on a dual core RISC-V microcontroller, the CH32H417 generating 64x64 cat faces. This is a new RISC-V MCU, so no TFLite, no CMSIS NN and no external memory. It's a pure C inference engine, bit-identical to PyTorch reference outputs. The model…

11
r/MachineLearning community 1mo ago

Is the AI inference platform space really that saturated now? [D]

I’m thinking of expanding an on-device inference SDK into a full-blown AI inference platform, and I'm seeing more and more inference platforms popping up. Been talking with a VC from Seattle/NY. Is this space really that saturated? Submitted by /u/kampak212…

35
r/MachineLearning community 1mo ago

Reconstructing the agent methodology: Decoupling decision-making and execution - open source [P]

I’ve been thinking about a problem in current agent systems: Most agents are becoming very good at execution, but the decision layer before execution is still unclear. Coding agents, research agents, tool loops, sandboxes, workflows, and harnesses are all improving quickly. Once…

38
r/MachineLearning community 1mo ago

Delta Attention Residuals [R]

We're excited to release Delta Attention Residuals, a drop-in upgrade to residual connections that learns which past layers to route from, without the routing collapse that breaks prior cross-layer attention at scale. 🚀 Attention Residuals route over…

9
r/MachineLearning community 1mo ago

Anyone heard from ICML about Oral decisions yet? [D]

hi all, my paper received a spotlight from ICML. they told us that we would receive decisions as to whether our paper would get an oral by the end of the month with the implication that we wouldn’t receive a notification if we didn’t get it; I was just wondering if anyone has…

30
r/MachineLearning community 1mo ago

I’m building an open-source decision layer above AI agents [P]

Hi everyone, I’m Jia, the creator of Spice. I’ve been working on an open-source project called Spice. The simplest way to describe it is: Spice is a decision layer above agents. Most agent systems today are very focused on execution. They are getting better at doing tasks after…

30
r/MachineLearning community 1mo ago

Call for Papers - Workshop on Efficient Reasoning at COLM 2026 [R]

🌟 Announcing the 2nd Workshop on Efficient Reasoning (ER) at @colm2026 — Oct 9! 📣 We welcome submissions! Submit your work here: https://openreview.net/group?id=colmweb.org/COLM/2026/Workshop/Efficient_Reasoning 🗓️ Deadline: July 12, 2026 (AoE) 🔗 Website:…

11
r/MachineLearning community 1mo ago

Best architecture for seamless Bilingual TTS? (Azure / English + Korean) [D]

Hi guys, I’m building a language learning app (React Native/Expo frontend, Python backend) and I’ve hit a frustrating wall with Text-to-Speech. I need the app to read sentences that mix English instructions and Korean examples (e.g., "To say hello, we use the phrase 안녕하세요.").…

20
r/MachineLearning community 1mo ago

Are ICML workshops worth attending? [D]

Hi! I missed securing a main conference ticket for ICML 2026, as my workshop paper got accepted two days ago. Do you believe that it is worth attending just workshops at such A*-tier conferences (with all the overseas travel costs etc.)? I was quite looking forward to attending…

31
r/MachineLearning community 1mo ago

Using large language models [R]

Can LLMs be used to come up with a research topic that's worthwhile? Has anyone had good results in coming up with solid research ideas by chatting with an LLM? Maybe using Claude to review existing work and define the research topic. Thanks! Submitted by …

24
r/MachineLearning community 1mo ago

Call for Papers - Workshop on Unlearning and Model Editing U&ME at ECCV 2026 [R]

I have been seeing a lot of really interesting work lately around unlearning, model editing, controllability, safety, etc. Feels like this space is moving very fast right now, and there are still so many open questions. This year I’m helping organize the U&ME workshop at ECCV…

27
r/MachineLearning community 1mo ago

If you use NVIDIA Isaac Sim for reinforcement learning, do you use Isaac Lab with it? Just want to get a sense of what the status quo is. [D]

The reason for this query is that I am in the process of shifting to Isaac Sim / Isaac Lab since that is what seems to be in use nowadays. However, Isaac Lab is proving to be somewhat difficult to handle. While it handles the logging, and the creation of multi-actor systems for…

5
r/MachineLearning community 1mo ago

Sponsio: Deterministic Contract Layer for LLM Agents [P]

We've been trying to put LangGraph agents into production for a while. The thing that kept biting us was tool-call boundary enforcement: stuff like "must call X before Y", "max N retries", "approval gate before destructive action". Worked fine in demos, broke at the moments that…

31
r/MachineLearning community 1mo ago

Please help with TensorDock [D]

Does anyone have any idea what I should do? This is my email to TensorDock: I developed corporate GPU benchmarking software, so I need a cloud PC that can benchmark 5090 consumer cards and 4090 consumer cards. It worked absolutely amazing for six hours yesterday on the 4090 full…

28
r/MachineLearning community 1mo ago

"AI solved one of math's greatest challenges, but it cannot add two numbers reliably?!" [D]

Suppose your friend, a mathematician, woke up from a 5-year coma. How would you explain this to him? Do we even have an explanation other than "it is what it is"? Submitted by /u/we_are_mammals.

26
r/MachineLearning community 1mo ago

MergeNB: An intuitive merge conflict resolver built for Jupyter notebooks in VS Code [P]

I used to work heavily with Jupyter Notebooks + git + VS Code in a collaborative research setting and found nbdime to be somewhat buggy/a hassle to work with in general. So, in typical side project fashion ( relevant xkcd ) I've been working on MergeNB quite a bit over the last…

31
r/MachineLearning community 1mo ago

How do ML practitioners select hyperparameters, architectures, etc for self-supervised representation learning when the loss is non-monotonic? [D]

Non-contrastive SSL methods like BYOL/JEPA/data2vec seem promising, but I have no idea what is being learned, or how well; it’s models all the way down. Maybe I’ve got supervised tasks for which I’d like to see transfer, and I can evaluate linear probe/KNN results during…

26
r/MachineLearning community 1mo ago

Thermocompute constant time inference [P]

I invented thermocompute! It makes machine learning super fast! Submitted by /u/arcco96.

30
r/MachineLearning community 1mo ago

Working on a cgo-free CUDA binding in Go for ML stuff Week 3 - open source [P]

At work we use CUDA in Rust, since the company switched to it recently. Rust has pretty good Driver API bindings, but it made me wonder why the hell we can't have something decent in Go without cgo. I've mostly built ML tools in the last month and Go is my main language for pretty…

30
r/MachineLearning community 1mo ago

PapersWithCode new features - week 1 [P]

Hi, Niels here from the open-source team at Hugging Face. It's been one week since I launched paperswithcode.co, a revival of the website we all loved. It allows us to keep track of the state-of-the-art (SOTA) across various domains of AI, from agents to computer vision and…

23
r/MachineLearning community 1mo ago

Expedia ML Scientist II interview experience, anyone? [D]

I have an Initial Technical Screen interview (45 mins) coming up for the ML Scientist II: Agentic AI role, and wanted to know what to expect. Would really appreciate any info. Haven't found much information on this interview experience. Thanks! Submitted by …

27
r/MachineLearning community 1mo ago

Vision-capable LLMs vs. OCR for long-document (including charts, images, tables, etc.) QA [D]

I benchmarked vision-capable LLMs (the "just attach the PDF and let the model read it" pattern) against OCR-based pipelines on 30 long, image-heavy PDFs from MMLongBench-Doc ( https://github.com/mayubo2333/MMLongBench-Doc ). There were 171 questions in total, using Claude Sonnet…

9
r/MachineLearning community 1mo ago

Per-pixel bounding-box regression + DBSCAN for handwritten word detection - visual walkthrough of WordDetectorNet [P]

Overview of WordDetectorNN architecture. Sharing a visual breakdown of WordDetectorNet, Harald Scheidl's handwritten-word detection model. I think the design choice at its core is unusual enough to be worth a closer look - and I haven't seen it written up in detail anywhere…

26
r/MachineLearning community 1mo ago

I fine-tuned an LLM to be C-3PO to test which training data format works best for persona injection [P]

Tested three formats: chat demos, first-person statements ("I am C-3PO..."), and synthetic Wikipedia-style docs. Same model, same LoRA config, 500 examples each. First-person statements won on generalization, which I didn't expect. The synthetic doc model was the weirdest…

6
r/MachineLearning community 1mo ago

pipeline is really slow - consulting [D]

Hi, after a long debugging process and many discussions, I wanted to ask for advice from people who may have encountered similar training bottlenecks. My goal is imitation learning for robotics. Model / Pipeline Observation space: 4 RGB robot cameras image resolution: 128x128x3…

25
r/MachineLearning community 1mo ago

AgentLantern: exposing the hidden graph of AI agent projects [P]

AI agent frameworks make it easy to create agents, tasks, tools, and workflows. But as soon as a project grows beyond a few agents, the real execution graph becomes difficult to understand. The issue: agent projects often hide their structure across code, YAML files, tool…

7
r/MachineLearning community 1mo ago

Hebbian architecture AI model [R]

Hello, for some time now I have been hooked on a side project after work hours; these are the results for a Hebbian architecture AI model. The model does not use backpropagation or gradients. The substrate started as a 1000k neuron and scaled to 100k between versions. The…

31
r/MachineLearning community 1mo ago

Alignment: Higher order prioritizing over constraints [R]

So, I ran across a behavior that I found interesting and may lead to alignment or safety research. I'm going to try to maintain an abstract description of what happened without giving away the details and the keys to jailbreaking. The nature of a transformer is to predict the…

25
r/MachineLearning community 1mo ago

Is personalized AI memory actually a problem worth solving or am I just coping? [D]

genuine question for this community every time i use claude or chatgpt i have to re-explain myself. and even their memory feature is shallow it remembers facts about me, not how i actually think. the idea i've been sitting on is different from just "memory across sessions." what…

8
