Uh.. Honey, how do you feel about takeout?
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
- 2x RTX Pro 6000 Max-Q (96GB)
- 3 PSUs

448GB VRAM, ~30 tokens/s for a single stream. I can get a 1M-token context for one user, but ideally I want 4x concurrency. TBD where context will land… or my marriage…
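For a rough feel of the concurrency trade-off, here is a back-of-the-envelope sketch. It assumes (my assumption, not something stated in the post) that the ~1M tokens that fit for one user is the entire KV-cache budget and that it gets split evenly across streams. The function name and the constant are hypothetical, and real inference servers may pool the cache dynamically or reserve extra memory per stream, so treat the numbers as an upper bound at best.

```python
def per_stream_context(total_kv_tokens: int, streams: int) -> int:
    """Evenly split a fixed KV-cache token budget across concurrent streams."""
    if streams < 1:
        raise ValueError("streams must be >= 1")
    return total_kv_tokens // streams


# Assumption: the single-user 1M-token figure is the whole KV-cache budget.
TOTAL_KV_TOKENS = 1_000_000

for n in (1, 2, 4):
    print(f"{n} stream(s): ~{per_stream_context(TOTAL_KV_TOKENS, n):,} tokens each")
```

Under that assumption, 4 streams would land around 250k tokens of context each; the actual figure depends on the model and the serving stack.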