r/MachineLearning
500 articles archived
-
r/MachineLearning community 1mo ago
What’s the actual focus in World Models right now? [R]
Hey everyone, I'm trying to get back into the loop on world models. The last time I followed SSL closely, the buzz was all about Barlow Twins and DINO, but now everything just looks like scaled-up video generation from big industry labs. What is the actual academic research…
36 -
r/MachineLearning community 1mo ago
Workshop on Unlearning and Model Editing U&ME at ECCV 2026 [R]
Submitted by /u/Mushroom-Severe
21 -
r/MachineLearning community 1mo ago
Arabic ASR model struggling to converge during training [D]
I'm trying to train an ASR model using the LibriSpeech recipe from SpeechBrain (without the language model) on a 100-hour dataset of dialectal Arabic speech. The model architecture uses a Conformer-small encoder and a Transformer decoder, with a total of around 13M parameters.…
23 -
r/MachineLearning community 1mo ago
I built a tool to browse and plan CVPR workshop/tutorial days [P]
Hi everyone, as someone attending CVPR, one thing that always frustrated me was managing the workshop and tutorial days. The information is technically all there, but in practice it is scattered across dozens of workshop websites, PDFs, schedules, and program pages. I often…
22 -
r/MachineLearning community 1mo ago
Built an AI Accelerator and opensourced it. [P]
There is a huge gap in open source AI accelerators, so I implemented my own. The popular, well-known ones are already legacy and don't support contemporary operations like attention. Here is what makes mine special: Attention mechanism smelted directly into silicon Prototyped…
25 -
r/MachineLearning community 1mo ago
When are ICML openreviews made public? [R]
First time, so no idea. Submitted by /u/camelCasedUser
28 -
r/MachineLearning community 1mo ago
How would you model this "strand" clustering problem? [P]
https://preview.redd.it/llqlupnwng4h1.png?width=2188&format=png&auto=webp&s=7fae5860babaffa1c8bfdcb1468b374eb38ac55d I'm currently building a computer vision application. I've managed to successfully train a YOLO model to detect the object I'm interested in for my videos. The…
33 -
r/MachineLearning community 1mo ago
[D] Monthly Who's Hiring and Who wants to be Hired?
For job postings, please use this template: Hiring: [Location], Salary:[], [Remote | Relocation], [Full Time | Contract | Part Time] and [Brief overview, what you're looking for]. For those looking for jobs, please use this template: Want to be Hired: [Location], Salary…
5 -
r/MachineLearning community 1mo ago
How to fine-tune an LLM for open-ended problems? [P]
I want to develop an LLM that can solve open-ended math problems (such as proof-only problems). This means that RLVR, where the final answer alone serves as the reward signal, is not enough. Since SFT is useless here and GRPO/PPO methods will not have an appropriate reward function,…
34 -
r/MachineLearning community 1mo ago
Query about non-archival workshop at CVPR-2026 [R]
My paper was recently accepted to a workshop at CVPR-2026 as a non-archival acceptance. Is it mandatory for me to register for the conference? I won't be able to attend (visa issues), but my friend will be at the conference and can present on my behalf. I have a few questions…
9 -
r/MachineLearning community 1mo ago
Why do the output layer weights become word vectors in Word2Vec? [D]
I'm trying to understand the intuition behind Word2Vec training using a neural network. In Word2Vec (CBOW or Skip-gram), we often hear that the weight matrices learned during training contain the vector representations (embeddings) of words. However, I don't understand why the…
31 -
r/MachineLearning community 1mo ago
Requesting reduction in reviewer load for NeurIPS? [D]
I didn't submit any but did place bids on some papers. I got assigned four papers. I have a bit of travel coming up and I don't think I will be able to do justice to that many papers, especially in the rebuttal period. Is this the standard reviewing load? In other communities…
25 -
r/MachineLearning community 1mo ago
Event like spiking neuron lib that fits into the CPU cache [P]
I benchmarked it against PyTorch with a Wikipedia dataset. I heavily used Gemini Flash 3.5 to build out my vision: https://huggingface.co/etoxin/neuronguard-wikipedia-classifier Submitted by /u/Logical_Prompt_3543
21 -
r/MachineLearning community 1mo ago
Graduating Without a PhD Internship [D]
In early 2022, I was deciding between PhD offers. The deal maker was a prospective supervisor telling me that through their connections with big tech, I would be able to do a PhD internship each summer, which was one of my main goals for the PhD. During my first and second…
37 -
r/MachineLearning community 1mo ago
ICML paper checker is down? [D]
I was getting ready to upload my camera-ready paper to ICML (a few minutes before the deadline... no comments), but the paper checker site seemingly went down before I could finish... I emailed the publication chairs already but I just wanted to know if anyone else was in the same…
6 -
r/MachineLearning community 1mo ago
Hopfield Memory in VLA [R]
I am currently doing a research internship (2 months) in VLA, and I came across Hopfield networks through the paper "Hopfield Networks is All You Need". I see potential advantages in using one as a memory module over the transformer-architecture-based HAMLET…
18 -
r/MachineLearning community 1mo ago
Social Simulation with LLMs - Fidelity in Applications (CFP @ COLM'26) [R]
🌟 Announcing the 2nd Workshop on Social Simulation with LLMs (Social Sim'26) @ COLM 📣 Welcoming Submissions! Submission here:. 🗓️ Deadline: June 23, 2026 (AoE) This year's theme is "Fidelity in Applications", moving beyond compelling demos toward evaluation, robustness,…
11 -
r/MachineLearning community 1mo ago
ACM MM 2026 review discussion [D]
The AC email says the rebuttal runs from the 28th to the 4th, and the website lists June 4th as the deadline. So I created this post for the discussion. I know it's a MM conference and less about ML, but I think many people here are still submitting there. Submitted by …
32 -
r/MachineLearning community 1mo ago
Training GPT-like model on non-language series [R]
I am responsible for a research project that is supposed to train a GPT-like model (Transformer-decoder) with 100M, 250M and 500M model variants. # params ## training dataset - 750M tokens - vocabulary is ~15k to ~100k tokens (depends on tokenizer settings) - ~3% of the…
29 -
r/MachineLearning community 1mo ago
Diffusion models for sketch-guided trajectory simulation [R]
Blog post: https://wezteoh.github.io/posts/diffusion-for-sketch-guided-trajectory-simulation/ During NBA games, coaches often sketch attacking plays on a whiteboard and mentally simulate how teammates and defenders might react. In this project, I explored using diffusion models…
30 -
r/MachineLearning community 1mo ago
STEM PhDs transitioning to MLE/Data [R]
I'm hoping for some advice from any former PhDs outside of machine learning. If you made it into machine learning engineering and/or data science, what was the key for you? Any tips for this job market? It seems like non-computer-science PhDs are especially in trouble at the…
38 -
r/MachineLearning community 1mo ago
Should I attend ICML as a junior? [D]
I am a junior in college, and have two accepted workshop papers at ICML 2026. Some background: I had an accepted workshop paper last year at ICLR, but couldn't attend due to a rejected visa, which led to all the more disappointment. So this year I was VERY eager to attend, and…
4 -
r/MachineLearning community 1mo ago
What 1000+ Harness Experiments Taught Me About Self-Improving Agents [R]
I recently wanted to see whether an AI agent could self-improve a harness to solve terminal bench tasks. It’s possible for an AI agent to propose a meaningful one-time change to the harness, but after experimenting with this for a couple of weeks, I think the continuous…
35 -
r/MachineLearning community 1mo ago
AI-generated CUDA kernels silently break training and inference [R]
Last month NVIDIA released SOL-ExecBench, a new benchmark of 235 production CUDA kernels lifted from DeepSeek, Qwen, Gemma, and Kimi. We took several top-ranked AI-generated submissions and tried using them in production workloads. Many of them broke, sometimes in surprising…
14 -
r/MachineLearning community 1mo ago
Best Text to Text Translation Model? [D]
I'm working on a project that translates any language into English. So far, I've tried NMT models like NLLB, MADLAD, and SeamlessM4T v2. The main issue is that they struggle with proper nouns such as names, places, dates, and organizations. I also tried LLMs like Gemma 4, Qwen…
22