r/LocalLLaMA · 1 min read

Gemma 4 WebGPU Kernels 255 tok/s by x/@xenovacom

Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.

We need more of this. 100+ tok/s on dense models is the difference between defaulting to Claude/Codex for everything and having a local, private model do most of the heavy lifting, reaching for a frontier model only for work that needs heavy intelligence.
https://x.com/xenovacom/status/2065656427117437213

submitted by /u/yonz-
