r/MachineLearning
r/MachineLearning community 1mo ago
Tested chunking + embeddings data from 3 production websites. [P]
Tiered + page-role-aware RAG retrieval results across 3 corpora with very different content density:

Workspace  Sources  Chunks  HIGH  MEDIUM  LOW   REJECTED
Intercom   188      941     96    200     541   104
HubSpot    251      1705    40    508     1153  4
KPMG       53       209     3     14      127   65

(HIGH = avg operational score 0.84,…
19 -
r/MachineLearning community 1mo ago
LLMs are just giant probability machines pretending to think [P]
It’s fascinating that simple mathematics between tokens can eventually become a machine that writes essays, code, poetry, and even reasoning. We usually think probability means uncertainty. But LLMs show something strange: If probability + context + mathematical matching are…
36 -
r/MachineLearning community 1mo ago
Anonymous Data Upload for Submission [D]
How do you upload data anonymously for a submission (ACL/EMNLP)? I have several models I need to upload for replication and was thinking HuggingFace, but HF offers download tracking on a paid plan. Does this violate the policy since there is the potential of tracking the…
18 -
r/MachineLearning community 1mo ago
Could ML be used to automate C-suite organizational duties? [D]
We often see worry from workers that ML techniques will either fully replace them, or jostle them violently economically such that their earnings and well-being are impacted. Concurrently, many tech companies resist unionization/"guild" efforts to protect the careers of…
11 -
r/MachineLearning community 1mo ago
Custom image encoder [P]
Hello, I would like to know whether building my own image encoder would be a good idea instead of using models like CLIP, SigLIP/SigLIP2, or DINO. My use case is video frame classification. My pipeline is the following: the client sends me a video stream, sampled at 1 frame per…
5 -
r/MachineLearning community 1mo ago
COLM 2026 Reviews Discussion [D]
Didn't see one, so I wanted to make one myself. Reviews are actually already out; curious what everyone thinks about their quality? I've heard it's a mixed bag, and apparently some people got a concerning number of AI-generated reviews.
33 -
r/MachineLearning community 1mo ago
Live Human Detector on Outbound Phone Calls [R]
Goal: To save humans from wasting time sitting in call centre queues waiting to be answered. To have a tool listen in on the audio stream of a live call, post IVR navigation, to determine whether the call has transitioned out of the queue and to a live person. Requirements: The tool must…
20 -
r/MachineLearning community 1mo ago
Novel Problems in VLA [R]
I'm currently doing a research internship and my supervisor is constantly pushing me to come up with a novel idea. I've read about 15-20 papers on VLA and I think most directions are saturated. I thought about an equivariant VLA based on equivariant CNN which was published in…
21 -
r/MachineLearning community 1mo ago
Lisbon Machine Learning School (LxMLS 2026) [D]
Hi, did anyone apply to it or attend it previously? How was the experience? I got accepted but without a scholarship; is it worth going self-sponsored?   submitted by   /u/Icy-Solid-4159
21 -
r/MachineLearning community 1mo ago
Does this idea sound fun? [R]
It's about inference-time learning by inserting some experts specialized for updating sibling expert weights in MoE. All the components needed were already there, but no one tried it inside MoE, so I did a small PoC. It kinda worked. I'd love to hear what you think.…
33 -
r/MachineLearning community 1mo ago
Columbia Machine Learning Summer School (MLSS) 2026 [D]
I got into this CFE MLSS 2026 and would like to connect with people who also got into it or have been in previous cohorts! I am organizing a group chat for people who got into the program :DD https://cfe.columbia.edu/content/mlss   submitted by   /u/elucidativemind…
24 -
r/MachineLearning community 1mo ago
High E2E latency on fine-tuned Gemma 4 26B despite low TTFT [R]
Recently fine-tuned a Gemma 4 26B model, and I’m seeing surprisingly high end-to-end latency despite the effective inference footprint being much smaller (~4B-ish behavior during serving). Current setup: Model: Gemma 4 26B (fine-tuned) Engine: vLLM Quantization: FP8 Hardware:…
27 -
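One common way to reason about a gap between low TTFT and high end-to-end latency is to treat the end-to-end time as TTFT plus the decode time for the remaining output tokens. The sketch below is only that arithmetic; the function name and all numbers are hypothetical and are not measurements from the post.

```python
def e2e_latency_s(ttft_s: float, output_tokens: int, tpot_s: float) -> float:
    """End-to-end latency as TTFT plus decode time for the remaining tokens.

    ttft_s: time to first token, in seconds.
    tpot_s: average time per output token after the first, in seconds.
    """
    return ttft_s + max(output_tokens - 1, 0) * tpot_s


# Hypothetical numbers: the same TTFT, very different response lengths.
short_reply = e2e_latency_s(ttft_s=0.2, output_tokens=50, tpot_s=0.03)
long_reply = e2e_latency_s(ttft_s=0.2, output_tokens=1000, tpot_s=0.03)

print(f"short reply: {short_reply:.2f} s, long reply: {long_reply:.2f} s")
```

Under this simple model, a low TTFT says little about total latency once responses get long, because per-token decode time dominates.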
r/MachineLearning community 1mo ago
OpenAI claims a general-purpose reasoning model found a counterexample to Erdos's unit-distance bound [D]
OpenAI posted a math result today claiming that one of its general-purpose reasoning models found a construction disproving the conjectured n^{1+O(1/log log n)} upper bound in Erdős’s planar unit-distance problem. Announcement:…
31 -
r/MachineLearning community 1mo ago
LLMs and Emojis [D]
LLMs are trained on human data, so where does the tendency to add emojis come from? For example, when some models generate code explanations or even normal responses, they often add lots of emojis that people don’t really use that way in real life. My current guess (without…
33 -
r/MachineLearning community 1mo ago
How competitive are PhD admissions currently [D]
Hi, how hard is it currently to get a PhD position in machine learning? Like, what are the requirements to get into a decent mid-tier program (= they publish regularly in respected journals and their work gets read by some people)? How is it in different regions, e.g. US, Europe,…
10 -
r/MachineLearning community 1mo ago
Should I accept a PhD offer in NeuroAI [D]
Hi everyone. I am a recent CS grad and I have received a PhD offer from a school in the States. However, I am deeply unsure whether I should accept it. My hesitation comes from the interdisciplinary nature of the program. It will be jointly supervised by two professors, one…
27 -
r/MachineLearning community 1mo ago
Splitting data by label for FAISS [P]
If I have a labeled dataset, is it possible to split my data by label, where each chunk holds the sentences of one label, and then use this to label more sentences? And is this even a good idea for data labeling, where I search for a certain sentence and see what the…
25 -
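The labeling idea in the post above can be sketched as a nearest-neighbour vote: embed the labeled sentences, find the most similar ones to a new sentence, and take the majority label. This is a minimal sketch under assumptions. The 2-D vectors stand in for real sentence embeddings, the brute-force cosine search stands in for a vector index such as FAISS, and `predict_label` is a hypothetical helper. It keeps all labels in one pool with the label stored next to each vector, rather than one chunk per label.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def predict_label(query, labeled, k=3):
    """Majority vote among the k labeled vectors most similar to the query."""
    ranked = sorted(labeled, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy "embeddings" (hypothetical): each entry is (vector, label).
labeled = [
    ([1.0, 0.1], "billing"),
    ([0.9, 0.2], "billing"),
    ([0.1, 1.0], "shipping"),
    ([0.2, 0.9], "shipping"),
]

prediction = predict_label([0.8, 0.3], labeled, k=3)
print(prediction)
```

Searching one shared pool keeps similarity scores comparable across labels. With one index per label, the per-label results would have to be merged before voting.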
r/MachineLearning community 1mo ago
Any tool to get accepted conference papers sorted by citation count? [D]
I.e., given a conference (say, with OpenReview data), e.g. “NeurIPS, 2025”, return the accepted papers sorted by number of citations according to a standard paper search engine (e.g. Google Scholar). Seems to be a surprisingly difficult thing to find online.
16 -
r/MachineLearning community 1mo ago
NOML-NOML: hierarchical TD3 + anchor policy for flight control [P]
I built a custom RL algorithm for continuous flight control and open-sourced it. Sharing here in case the structural ideas are useful for anyone doing continuous control where one action axis dominates. I've been training continuous control on a 6-DoF flight sim…
31 -
r/MachineLearning community 1mo ago
Machine Learning on Spherical Manifold [R]
Hi, I'm interested in geometric deep learning (due to Michael M. Bronstein's book and Maurice Weiler's PhD thesis), and in order not to write projects to nowhere, I decided to keep a technical blog. I started with a short note about machine learning on spherical manifolds, but…
34 -
r/MachineLearning community 1mo ago
Instructions for (ICML) workshop reviews [D]
Hi, I am a reviewer for an ICML workshop; however, there are no guidelines on the structure of the reviews (e.g. what the criteria are, what the grading scale is, etc.). Does anyone know whether ICML workshops have some "convention" regarding reviews? Or ought we to use…
22 -
r/MachineLearning community 1mo ago
[ECCV 2026] No modified date next to reviews [D]
On OpenReview, you can see a modified date next to the review. This modified date should be recent (12th May or newer), which means that the reviewer gave a final justification and may have increased their score or kept the same score. In either case, it means they read the…
29 -
r/MachineLearning community 1mo ago
Comparing data annotation platforms [D]
Scale AI: Highest quality in the industry. But no public pricing, and every project requires a sales call. Onboarding takes weeks, not days. In June 2025 Meta bought a 49% stake and hired Scale’s CEO as Meta’s Chief AI Officer. Several major customers quietly reduced engagements…
27 -
r/MachineLearning community 1mo ago
LxMLS 2026 decision [D]
Has anyone applied to LxMLS 2026? Did you get any update?   submitted by   /u/No_Cardiologist7609
34 -
r/MachineLearning community 1mo ago
What do you think about Tabular Foundation Models [D]
I've seen TabPFN-3's recent results, and there is a lot of buzz about foundation models for tabular data (TabICL, TabPFN). The performance that those models achieve is really amazing. What makes me a little suspicious about them? They can analyze small datasets only, so a few MB…
7 -
r/MachineLearning community 1mo ago
All the fundamental knowledge from Andrew Ng's ML course that I noted down and turned into a GitHub repo [R]
https://preview.redd.it/mikhasjiq32h1.png?width=572&format=png&auto=webp&s=4c053200dbd9852bebf083550e2144b31579d497 https://preview.redd.it/bay5r3njq32h1.png?width=575&format=png&auto=webp&s=2823db3d6bc534ef00330528a200cba2aca1c5d3…
4 -
r/MachineLearning community 1mo ago
How do loss functions work in PINNs? [D]
I am learning about physics-informed neural networks (PINNs). I am playing with simple first/second-order 1D ODEs and I am calculating the loss by adding the initial-condition loss and the physics loss (e.g. Total loss = lambda1 (L1) * Physics_loss (PL) + lambda2 (L2) * IC_loss (IL)).…
30 -
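The weighted sum described in the post above can be sketched for a toy problem. Everything here is an assumption made for illustration: the ODE u' + u = 0 with u(0) = 1, a two-parameter trial function c * exp(a * t) standing in for the neural network, and a finite-difference derivative where a real PINN would use automatic differentiation.

```python
import math


def trial_solution(t, c, a):
    """Hypothetical stand-in for the network output: u(t) = c * exp(a * t)."""
    return c * math.exp(a * t)


def du_dt(t, c, a, h=1e-5):
    """Central finite difference; a real PINN would use autodiff here."""
    return (trial_solution(t + h, c, a) - trial_solution(t - h, c, a)) / (2 * h)


def total_loss(c, a, collocation_points, lambda1=1.0, lambda2=1.0):
    # Physics loss: mean squared residual of the ODE u' + u = 0.
    residuals = [du_dt(t, c, a) + trial_solution(t, c, a) for t in collocation_points]
    physics_loss = sum(r * r for r in residuals) / len(residuals)
    # Initial-condition loss: squared error against u(0) = 1.
    ic_loss = (trial_solution(0.0, c, a) - 1.0) ** 2
    return lambda1 * physics_loss + lambda2 * ic_loss


points = [i / 10 for i in range(11)]  # collocation points on [0, 1]

loss_exact = total_loss(1.0, -1.0, points)    # true solution exp(-t)
loss_wrong_ic = total_loss(0.5, -1.0, points)  # satisfies the ODE, violates u(0) = 1
loss_wrong_ode = total_loss(1.0, -0.5, points)  # satisfies u(0) = 1, violates the ODE

print(loss_exact, loss_wrong_ic, loss_wrong_ode)
```

The second case shows why the initial-condition term matters: 0.5 * exp(-t) has zero physics residual, so the physics loss alone cannot tell it apart from the true solution. The weights lambda1 and lambda2 set how strongly each violation is penalized.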
r/MachineLearning community 1mo ago
How to get rejected by IEEE T-PAMI with 'Excellent' scores? [D]
Hello everyone. I am keeping my identity anonymous today to protect my professional career. I am a junior researcher in Computer Vision, and I am sharing this story because I have hit a devastating deadlock with IEEE T-PAMI and the IEEE Ethics Office. Our Situation:…
13