21 GPU's benchmarked running a small TTS model (vram peak: 5GB)
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
| I rented different GPUs on vast.ai for a few minutes each to benchmark a small TTS model, OmniVoice, with a peak VRAM usage of about 5 GB. I wanted to see how various mostly consumer GPUs would stack up against my own RTX 3090. This is by no means an extensive or scientific analysis, but I think it gives a rough estimate of how these GPUs perform relative to each other. xRT means times real-time. It shows how much faster than real-time the GPU generates audio. Average of 3 runs of a small paragraph with reference audio provided (voice cloning). [link] [comments] |
More from r/LocalLLaMA
-
Local benchmarks with a RTX 3090 - Qwen3.6 27b vs Ornith
Jul 2
-
It's officially over. One of the fathers of AI at Nvidia doesn't believe in AGI and compares OpenAI and Anthropic's closed models to AOL and Prodigy's closed internets. Says the future is every business having a customized open source model.
Jul 2
-
6x P40 running Minimax M2.7_Q3_XL
Jul 2
-
Fine-tuned Gemma-4-31B specifically for Copywriting & Creative Writing Tasks (Scored +290 Elo over base using EqBench3)
Jul 2
Discussion (0)
Sign in to join the discussion. Free account, 30 seconds — email code or GitHub.
Sign in →No comments yet. Sign in and be the first to say something.