r/LocalLLaMA · May 30, 2026 · 1 min read

STT -> LLM -> TTS pipeline

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

Hey guys, I’m trying to learn about how to better create a STT LLM TTS pipeline.

My current setup is running a 3090 on Ubuntu. I use llama.cpp to run Qwen 3.6 27B Q4 with pi-agent for tool calling, and I just run everything in the terminal, I haven’t really bothered with chat style front ends.

I’m trying to figure out how the actual pipeline goes when using 3 models to process information like that. I understand how to run a single model obviously, but as someone who isn’t a trained coder, I don’t really understand what sort of framework is used to pipe information from the STT model to the LLM, and back out to the TTS model. Am I running three llama.cpp instances?

Need some guidance. Thanks!

submitted by /u/UniqueIdentifier00
[link] [comments]

Discussion (0)

No comments yet. Sign in and be the first to say something.

Discussion (0)

More from r/LocalLLaMA