It looks like Rio 3.5 397B could've simply been a semi-failed embezzlement of funding
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Here is the chain of events:
- The model training received funding of R$500K (about $100K USD).
- The initial model documentation claimed that it was developed on top of Qwen 3.5 397B with fancy training and great improvements.
- It was discovered that the model was a cheap, simple merge with Nex N2 Pro without any further training.
- The model readme was updated to admit that it was based on a Nex N2 Pro merge, while still insisting that additional training took place and that they had simply uploaded the wrong model. The previously uploaded model was removed from HF.
- They tweeted (amid what looks like an attempt at damage control) that the final trained model got lost, so they'll have to redo it from scratch.
This reads to me like "we pocketed the funding, delivered a fake result, got caught, and now promise to do the actual work to mitigate impact on us".