I built a local LLM NPC backend focused on NPC-to-NPC conversations
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
I just released a research project I did last year as open source. It is a fully local speech-to-speech backend for LLM NPCs. So speech-to-text, local LLM, text-to-speech, no cloud needed.

The main focus was NPCs talking to each other, not just answering the player, and my study looked at how players experience witnessing those NPC-to-NPC conversations and what it does for immersion. The NPCs can talk to each other, remember what they said, and later use that context when the player talks to them. There is also a background Game Manager AI that can inject hidden behavioral notes into NPCs to steer the story a bit.

Latency was one of the main technical challenges. With Llama 3.2 3B for VR and 7B on a 4070 Ti I was getting around 400 to 600ms Time to First Audio (TTFA), which is roughly where it starts feeling like a real conversation instead of waiting for the NPC to think. It also runs alongside the Unity scene, which you can see in the demo.

For multiple NPCs, I used a shared generation lock so the GPU does not get overloaded. Each NPC has its own LLM context/personality and TTS setup, but only one generates at a time. They take turns, and the switch between characters is basically instant, so it feels natural. The limitation is that two NPCs cannot literally speak over each other at the exact same moment.

It is WebSocket based, so it should work with Unity, Unreal, or anything else that can talk over WebSockets. I also included the Unity scripts.

I would really like people to try it, build on it, or give feedback. To adapt it to your own game, the main work is tuning the 3-layer NPC prompt setup and the Game Manager prompt. That takes a bit of work, but it is very doable with AI help, and I think a lot of it could be automated later.

Demo video, detective game in Unity: [link]
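For anyone wondering what the shared generation lock could look like, here is a minimal sketch in Python with `asyncio`. It is not taken from the released code: the `NPC` class, `respond`, `_generate`, and `demo` are hypothetical names, and the LLM/TTS call is replaced by a short sleep. It only illustrates the idea from the post: each NPC keeps its own persona and history, but all of them share one lock, so only one generation runs at a time even when several NPCs are asked to respond at once.

```python
import asyncio


class NPC:
    """Hypothetical NPC: own persona and memory, shared generation lock."""

    def __init__(self, name, persona, gen_lock):
        self.name = name
        self.persona = persona    # per-NPC personality prompt
        self.history = []         # per-NPC memory of what was heard and said
        self.gen_lock = gen_lock  # the same asyncio.Lock for every NPC

    async def _generate(self, heard):
        # Stand-in for the real LLM + TTS call (assumption: it is awaitable).
        await asyncio.sleep(0.01)
        return f"[{self.name}] reply to: {heard}"

    async def respond(self, heard, log):
        self.history.append(("heard", heard))
        # Only one NPC may hold the lock, so generations never overlap.
        async with self.gen_lock:
            log.append(("start", self.name))
            reply = await self._generate(heard)
            log.append(("end", self.name))
        self.history.append(("said", reply))
        return reply


async def demo():
    gen_lock = asyncio.Lock()
    log = []
    npcs = [
        NPC("Butler", "formal and evasive", gen_lock),
        NPC("Gardener", "blunt and tired", gen_lock),
        NPC("Cook", "chatty and nervous", gen_lock),
    ]
    # All three are asked at the same moment; the lock makes them take turns.
    await asyncio.gather(
        *(npc.respond("Who was in the library?", log) for npc in npcs)
    )
    return log, npcs


if __name__ == "__main__":
    events, _ = asyncio.run(demo())
    for event in events:
        print(event)
```

In the log, every `start` is followed by the matching `end` before the next NPC starts, which is the turn-taking behavior described above. The trade-off is the same one mentioned in the post: with a single lock, two NPCs can never generate at the exact same moment.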