Same model, same prompt, 4 different agents
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Setup: one self-hosted Qwen3.6-27B (Q4) on llama.cpp, identical prompt, identical hardware. The only variable is the agent scaffolding.

Agents tested: pi, opencode, hermes, qwen code.

Task: a single-file 2D canvas solar system with scripted orbits and gravity that acts only on user-launched comets. The prompt included an explicit "build incrementally, your context window is small" instruction.

Results: all 4 produced a working sim, but the code quality differs a lot:

opencode: my pick. Cleanest architecture.

pi: most correct. Coordinate-consistent, uses distance softening to avoid singularities, removes comets that hit the Sun, has planet labels, and is the only one with touch support. Less flashy, most robust.

hermes: flashiest, but physically wrong. The only one with real elliptical orbits, plus a nice drag-vector arrow. But it computes planet gravity on comets at a different time step than the one it uses to render the planets, so comets are pulled toward where the planets aren't. Looks best, simulates worst.

qwen code: most minimal. Shortest, and it runs, but crude: a huge launch-velocity multiplier flings comets off instantly, with no softening and no stars.

Takeaway: with a fixed local model, the agent's scaffolding visibly changes the output (integration strategy, coordinate hygiene, edge-case handling). The prettiest demo (hermes) was the buggiest; the plain-looking one (pi) was the most correct; opencode hit the best balance of clean code and stable physics. Curious whether others get the same ranking on their own local setups.
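To make the physics points above concrete (distance softening, removing comets that hit the Sun, and using the same time step for gravity and rendering), here is a minimal sketch. It is not code from any of the four agents, and it is in Python rather than the JavaScript canvas code the agents produced. All names and constants (`G`, `SOFTENING`, `SUN_RADIUS`, `planet_position`, `step_comets`) are hypothetical, the orbits are simplified to circles, and the integrator is an assumed semi-implicit Euler step.

```python
import math

G = 1.0            # gravitational constant in sim units (assumed value)
SOFTENING = 5.0    # softening length in pixels (assumed value)
SUN_RADIUS = 20.0  # comets ending a step inside this radius are removed (assumed)


def planet_position(orbit_radius, angular_speed, phase, t):
    """Scripted circular orbit around a Sun at the origin, evaluated at time t."""
    angle = phase + angular_speed * t
    return (orbit_radius * math.cos(angle), orbit_radius * math.sin(angle))


def softened_acceleration(px, py, bx, by, mass):
    """Acceleration at (px, py) due to a body of the given mass at (bx, by).

    Softened form: a = G * m * d / (|d|^2 + eps^2)^(3/2), which stays finite
    as the distance goes to zero instead of blowing up like 1/r^2.
    """
    dx, dy = bx - px, by - py
    d2 = dx * dx + dy * dy + SOFTENING * SOFTENING
    scale = G * mass / (d2 * math.sqrt(d2))
    return (dx * scale, dy * scale)


def step_comets(comets, planets, sun_mass, t, dt):
    """Advance every comet by dt and return the survivors.

    Planet positions are evaluated at the same time t that the renderer would
    draw, so comets are pulled toward where the planets actually appear.
    """
    bodies = [(0.0, 0.0, sun_mass)]
    for p in planets:
        x, y = planet_position(p["r"], p["w"], p["phase"], t)
        bodies.append((x, y, p["mass"]))

    survivors = []
    for c in comets:
        ax = ay = 0.0
        for bx, by, m in bodies:
            dax, day = softened_acceleration(c["x"], c["y"], bx, by, m)
            ax += dax
            ay += day
        # Semi-implicit Euler: update velocity first, then move with the new velocity.
        vx = c["vx"] + ax * dt
        vy = c["vy"] + ay * dt
        x = c["x"] + vx * dt
        y = c["y"] + vy * dt
        # Drop comets that have fallen into the Sun.
        if math.hypot(x, y) > SUN_RADIUS:
            survivors.append({"x": x, "y": y, "vx": vx, "vy": vy})
    return survivors
```

In a render loop, the same `t` passed to `step_comets` would also be passed to `planet_position` when drawing. The hermes bug described above is what happens when those two values drift apart.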