Watch local LLMs escape the rooms you design
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
Hello! I'd like to share my repo for WATCH MY ESCAPE: https://github.com/cjami/watch-my-escape

It's an inverted escape room game where you design the maps and LLMs have to try to escape them. The models use traditional action verbs (e.g. push, pull, pick-up) to interact with the visible environment, just like in classic adventure games.

There are currently 5 model presets, each downloaded the first time you run an escape with it. All are at Q4_K_M, so they should fit in about 8GB of VRAM. Tested on a 4090, a 3070 and an M1. You can easily configure it for any model on HF by changing values in the config file: https://github.com/cjami/watch-my-escape/blob/main/src/watch_my_escape/llm/config.py

It features a fully kitted-out map editor as well, so you can create whatever you want and test models on it. It is completely font-based, so you can use whatever emojis are available to represent objects. It also supports import/export via JSON.

The main technique used here is splitting the agent's action into two steps, 'Think then Act': a free reasoning step followed by a grammar-constrained action step via llama.cpp. This allows us to use small models reliably within a game environment with structured output. Note: the models are not reasoning spatially; they just move from one visible object to another (anything more would overwhelm small models).

Quick setup requires uv and node.js to be installed. The setup should then auto-detect and install the appropriate llama-cpp-python wheel for your hardware (metal, cuda, vulkan, cpu, or rocm via override).

This was created over a week for the 'Build Small' hackathon by Hugging Face x Gradio. Use it to try out different LLMs or make your own personal benchmarks! Hopefully this also provides a glimpse into how LLMs can be used in future games :)
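For anyone curious what the 'Think then Act' split could look like in code, here is a minimal sketch. It is not taken from the repo: `build_action_grammar`, `think_then_act`, the prompt wording and the `generate(prompt, grammar)` callable are all hypothetical names made up for illustration. The idea is that the first call is unconstrained, and the second call is restricted by a GBNF grammar that only accepts `<verb> <object>` built from the currently visible objects.

```python
import json
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical verb list; the post gives push, pull and pick-up as examples.
VERBS = ("push", "pull", "pick-up")

# Assumed interface: generate(prompt, grammar) -> text, where grammar is a
# GBNF string or None for free-form output. Not the repo's actual API.
Generate = Callable[[str, Optional[str]], str]


def build_action_grammar(verbs: Sequence[str], objects: Sequence[str]) -> str:
    """Return a GBNF grammar that accepts exactly '<verb> <object>'."""

    def alternatives(words: Sequence[str]) -> str:
        # json.dumps yields a double-quoted, escaped literal; emojis are
        # kept as-is so font-based object names survive.
        return " | ".join(json.dumps(w, ensure_ascii=False) for w in words)

    return "\n".join([
        'root ::= verb " " object',
        f"verb ::= {alternatives(verbs)}",
        f"object ::= {alternatives(objects)}",
    ])


def think_then_act(
    generate: Generate,
    observation: str,
    objects: Sequence[str],
    verbs: Sequence[str] = VERBS,
) -> Tuple[str, str, str]:
    """Run a free reasoning step, then a grammar-constrained action step."""
    # Step 1 (Think): no grammar, the model reasons freely.
    thought = generate(f"{observation}\nThink about what to do next:", None)

    # Step 2 (Act): output is limited to one verb and one visible object.
    grammar = build_action_grammar(verbs, objects)
    action = generate(f"{observation}\nThought: {thought}\nAction:", grammar)

    # Verbs contain no spaces, so the first space separates verb and object.
    verb, _, target = action.strip().partition(" ")
    return thought, verb, target
```

With llama-cpp-python, `generate` could be a thin wrapper that turns the grammar string into a `LlamaGrammar` via `LlamaGrammar.from_string(...)` and passes it as the `grammar` argument of the completion call. Because the grammar is rebuilt from the visible objects on every turn, the action step can only ever name something the model can actually see.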