GOT-OCR 2.0
580M-parameter end-to-end OCR-2.0 model: a vision encoder paired with a Qwen-based decoder, trained specifically for general OCR including math formulas (LaTeX out), tables (Markdown/HTML out), sheet music, geometric shapes, and dense multi-column documents.
The open-source answer for formula and table OCR — beats Nougat decisively and runs on a potato. Standard transformers integration is rough; commit to the custom inference path before adopting.
Strengths
- End-to-end formula OCR — outputs LaTeX directly, beats Nougat and most pipelines
- Table OCR straight to Markdown/HTML, preserving structure
- Apache-2.0, fully commercial-friendly
- Only 580M params — sub-2GB VRAM at FP16, viable on edge/CPU
- Multilingual including CJK; supports interactive region/color-based OCR
Weaknesses
- Requires custom inference code (loaded with trust_remote_code=True), so it is not a vanilla transformers pipeline
- Single-purpose: only OCR, no captioning, VQA, or chat
- Quality on noisy phone-camera photos lags commercial OCR (Azure Document Intelligence, Textract)
- Limited fine-tuning recipes published — adapting to a new domain is non-obvious
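The custom inference path mentioned above can be sketched roughly as follows. This is a minimal sketch, not a verified recipe: the repository ID comes from this page's source line, while the `model.chat(...)` method and the `ocr_type` values (`"ocr"` for plain text, `"format"` for LaTeX/Markdown-style output) are recalled from the upstream model card and should be treated as assumptions to check against the repository before use. The helper names `load_kwargs` and `run_ocr` are hypothetical.

```python
MODEL_ID = "stepfun-ai/GOT-OCR2_0"  # repository named in this page's source line


def load_kwargs():
    """Keyword arguments for loading the checkpoint through its bundled code."""
    return {
        "trust_remote_code": True,  # executes Python shipped in the model repo
        "low_cpu_mem_usage": True,
        "device_map": "cuda",       # assumption: a CUDA GPU is available
        "use_safetensors": True,
    }


def run_ocr(image_path, formatted=False):
    """Run OCR on one image file and return the decoded text.

    Assumption: the remote code exposes model.chat(tokenizer, path, ocr_type=...).
    """
    # Imported lazily so this file can be read and imported without transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        MODEL_ID,
        pad_token_id=tokenizer.eos_token_id,
        **load_kwargs(),
    )
    model = model.eval().cuda()

    # "format" requests structured output (LaTeX, Markdown); "ocr" is plain text.
    ocr_type = "format" if formatted else "ocr"
    return model.chat(tokenizer, image_path, ocr_type=ocr_type)
```

Calling `run_ocr("page.png", formatted=True)` (with a hypothetical image path) would download the weights on first use. Because `trust_remote_code=True` runs code from the repository, review it or pin a revision before deploying.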
Quantization variants
Each quantization trades some model quality for a smaller file and lower VRAM use. Q4_K_M is the most popular starting point and the only variant listed here.
| Quantization | File size | VRAM required |
|---|---|---|
| Q4_K_M | 0.3 GB | 1 GB |
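As a rough sanity check on the figures above, weight size can be estimated as parameters times bits per weight. The sketch below assumes about 4.85 bits per weight for Q4_K_M, a commonly quoted approximation rather than a measured value for this model. It counts weights only, so real VRAM use will be higher once activations, caches, and runtime overhead are included. The function name `weight_gb` is hypothetical.

```python
def weight_gb(num_params, bits_per_weight):
    """Estimate weight storage in decimal gigabytes (weights only)."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / 1e9


PARAMS = 580e6           # parameter count stated on this page
FP16_BITS = 16
Q4_K_M_BITS = 4.85       # assumption: approximate average bits per weight

fp16_gb = weight_gb(PARAMS, FP16_BITS)
q4_gb = weight_gb(PARAMS, Q4_K_M_BITS)

print(f"FP16 weights:   {fp16_gb:.2f} GB")
print(f"Q4_K_M weights: {q4_gb:.2f} GB")
```

Under these assumptions the FP16 weights come to about 1.16 GB, consistent with the sub-2GB claim, and the Q4_K_M weights to about 0.35 GB, in line with the rounded 0.3 GB file size in the table.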
Get the model
HuggingFace (original weights): huggingface.co/stepfun-ai/GOT-OCR2_0
This is the source repository, so you need to quantize the weights yourself.
Source: huggingface.co/stepfun-ai/GOT-OCR2_0
Reviewed by RunLocalAI Editorial. See our editorial policy for how we research and verify model claims.