TurboOCR v3 — high-speed document OCR server (C++/CUDA), ~520 img/s on RTX 5090
Mirrored from r/LocalLLaMA for archival readability. Support the source by reading on the original site.
TurboOCR is a self-hosted, high-speed document OCR server that runs fully locally. Here's what's new in v3:
Speed:
- Full pipeline now on the newest PP-OCRv6 models (up from v5): ~270 → ~520 img/s on FUNSD (v6 tiny, RTX 5090).
- Still fully local, HTTP + gRPC.
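To put the throughput numbers in perspective, here is a quick back-of-the-envelope sketch. It only uses the two approximate figures quoted above (~270 and ~520 img/s); the function names and the 100,000-image batch size are made up for illustration. Note that the per-image figure is amortized throughput time, not the latency of a single request, since a multi-stream server processes many images concurrently.

```python
# Approximate figures quoted in the post (FUNSD, v6 tiny, RTX 5090).
OLD_IMG_PER_S = 270.0
NEW_IMG_PER_S = 520.0


def speedup(old_rate: float, new_rate: float) -> float:
    """Ratio of new throughput to old throughput."""
    return new_rate / old_rate


def amortized_ms_per_image(rate: float) -> float:
    """Average wall-clock milliseconds per image at a given throughput."""
    return 1000.0 / rate


def seconds_for_batch(n_images: int, rate: float) -> float:
    """Time to drain a batch at sustained throughput (ignores warm-up and I/O)."""
    return n_images / rate


if __name__ == "__main__":
    print(f"speedup: {speedup(OLD_IMG_PER_S, NEW_IMG_PER_S):.2f}x")
    print(f"per image: {amortized_ms_per_image(NEW_IMG_PER_S):.2f} ms")
    # Hypothetical batch size, chosen only to make the difference tangible.
    n = 100_000
    print(f"{n} images, old: {seconds_for_batch(n, OLD_IMG_PER_S):.0f} s")
    print(f"{n} images, new: {seconds_for_batch(n, NEW_IMG_PER_S):.0f} s")
```

So the jump is a bit under 2x, and a 100,000-image batch would drop from roughly six minutes to a little over three, assuming the quoted rates hold for your documents and hardware.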
Structured parsing (the main addition):
- End-to-end now: layout → tables to HTML → formulas to LaTeX → reading-order Markdown.
- Tables and formulas are strict per-request opt-in, so you only pay the cost when you actually need them.
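A rough sketch of what the last stage and the opt-in behavior could look like conceptually. This is not TurboOCR's code or API: the `Block` type, the `kind` labels, and the `tables` / `formulas` flags are all hypothetical names invented for this example. It assumes an earlier layout stage has already assigned each region a reading-order index, and it falls back to plain OCR text for tables and formulas unless the caller opted in.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """One layout region. All field names here are hypothetical."""
    kind: str        # "title", "text", "table", or "formula"
    order: int       # reading-order index from the layout stage
    text: str = ""   # plain OCR text for the region
    html: str = ""   # table HTML, present only if table parsing ran
    latex: str = ""  # formula LaTeX, present only if formula parsing ran


def to_markdown(blocks: List[Block], tables: bool = False,
                formulas: bool = False) -> str:
    """Join blocks in reading order; use rich output only when opted in."""
    parts = []
    for b in sorted(blocks, key=lambda blk: blk.order):
        if b.kind == "title":
            parts.append("# " + b.text)
        elif b.kind == "table" and tables and b.html:
            parts.append(b.html)  # Markdown allows inline HTML tables
        elif b.kind == "formula" and formulas and b.latex:
            parts.append("$$" + b.latex + "$$")
        elif b.text:
            parts.append(b.text)  # fallback: plain OCR text
    return "\n\n".join(parts)


if __name__ == "__main__":
    page = [
        Block("text", 1, text="Total due:"),
        Block("title", 0, text="Invoice"),
        Block("formula", 2, text="E=mc2", latex="E=mc^2"),
    ]
    print(to_markdown(page))                 # plain text only
    print(to_markdown(page, formulas=True))  # formula rendered as LaTeX
```

The point of keeping the flags per request is that the caller who only wants plain text never triggers the heavier table and formula models, which matches the "only pay the cost when you need them" idea above.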
Stack: C++, TensorRT FP16, multi-stream, gRPC/HTTP, direct PDF endpoint, PP-OCRv6.