Where are we with computer-control harnesses?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Seems like local vision-language models are getting smart enough that it would be useful to hand them the cursor in a secure sandbox. What harnesses are available that can do this?
edit: oh my fucking God something about this post triggered all of the bots to come out and post their sloppy LinkedIn style bullshit. Fuck off.