llama-launcher v1.3 release -> Bayesian Optimisation
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Hello everyone. Some of you may have seen my post from a few days ago about my app, llama-launcher, a lightweight point-and-click GUI for building llama-server commands without having to type them out every time.

I've just added an optimisation feature that uses the Tree-structured Parzen Estimator (TPE) through the Optuna framework. It runs llama-server to tune a predetermined set of parameters and squeeze the last bit of performance out of your system, completely hands-free. I've been using it to get the most out of my MTP models without having to sit at my desk tuning, loading, prompting, and unloading manually, over and over.

So far, in testing with Gemma 12B MTP, I've seen up to a 15% improvement in speed (as seen in the images) over baseline commands with no tuning, without any human interaction during the optimisation process.

It's still in its early stages, so there are many improvements to be made. If you have any suggestions, please let me know.

You can check the repo out here: https://github.com/SolaryKryptic/llama-launcher
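For anyone curious what a TPE search over llama-server settings can look like in Optuna, here is a minimal sketch. It is not the code from llama-launcher: the post does not list which parameters are tuned, so the four flags below (`--threads`, `--batch-size`, `--ubatch-size`, `--n-gpu-layers`), their ranges, the `model.gguf` path, and every function name are assumptions made for illustration. The benchmark is a made-up formula standing in for a real "launch, prompt, measure tokens per second" run, so the sketch can be tried without a model.

```python
import math

MODEL_PATH = "model.gguf"  # hypothetical path, not taken from the post


def build_command(params):
    """Turn a parameter dict into a llama-server argument list."""
    return [
        "llama-server",
        "--model", MODEL_PATH,
        "--threads", str(params["threads"]),
        "--batch-size", str(params["batch_size"]),
        "--ubatch-size", str(params["ubatch_size"]),
        "--n-gpu-layers", str(params["n_gpu_layers"]),
    ]


def fake_tokens_per_second(params):
    """Synthetic stand-in for a real benchmark run (invented response surface)."""
    score = 40.0
    score -= 0.5 * abs(params["threads"] - 8)
    score -= 2.0 * abs(math.log2(params["ubatch_size"]) - 9)
    score += 0.1 * params["n_gpu_layers"]
    return score


def objective(trial):
    """One trial: sample settings, record the command, return the score."""
    params = {
        "threads": trial.suggest_int("threads", 1, 16),
        "batch_size": trial.suggest_categorical("batch_size", [512, 1024, 2048]),
        "ubatch_size": trial.suggest_categorical("ubatch_size", [128, 256, 512]),
        "n_gpu_layers": trial.suggest_int("n_gpu_layers", 0, 48),
    }
    # A real tuner would launch this command, send a prompt, and time the reply.
    command = build_command(params)
    trial.set_user_attr("command", " ".join(command))
    return fake_tokens_per_second(params)


def tune(n_trials=30, seed=0):
    """Run a TPE search that maximises the (synthetic) tokens-per-second score."""
    import optuna  # third-party; imported here so the helpers above work without it

    sampler = optuna.samplers.TPESampler(seed=seed)
    study = optuna.create_study(direction="maximize", sampler=sampler)
    study.optimize(objective, n_trials=n_trials)
    return study
```

Calling `study = tune()` and then reading `study.best_params` and `study.best_trial.user_attrs["command"]` gives the best settings found and the matching command line. Swapping `fake_tokens_per_second` for a function that actually starts the server and times a fixed prompt is where the real work would be, and each such trial would take far longer than this toy version.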