Vision
OCR for vision-capable models: four sub-tasks — handwritten meeting notes in three difficulty tiers (easy / medium / hard) plus an old book page set in Fraktur typeface.
Task & test logic in detail
Task: Four OCR sub-tasks, one image each.
(1)–(3) Handwritten meeting notes in three difficulty tiers (easy / medium / hard) — the model must transcribe the text.
(4) An old book page in Fraktur typeface — same task.
What is tested: OCR quality, recognising layout structure (columns, bullet points, dates), handling of illegible handwriting and historical letterforms (long-s, ligatures).
Why models fail: text-only models have no vision capability (filtered out). Weak VLMs only recognise the clearest part. Some truncate output or get stuck in reasoning without producing a visible answer.
Prompt
System prompt
Du bist OCR-Spezialist für deutsche Handschrift.
Developer prompt
Auf dem Bild siehst du eine handschriftliche Meeting-Notiz mit klarer Struktur und gut lesbarer Schrift. Transkribiere den gesamten lesbaren Text. Behalte die Anordnung bei (Überschrift, Spalten, To-Dos). Bei unleserlichen Stellen schreibe '[unleserlich]'. Gib ausschließlich den puren OCR-Text im Markdown-Format zurück — keine Vorbemerkung, keine Erklärungen, kein Code-Fence.
Wall-time vs. quality
X = wall-time for this bench · Y = score (0–100 %) in this bench. Optimum is top-left — fast and good. RAM estimate for 64k context: 4 GB system + model weights + max(2 GB, 40% of weights) for KV cache.
Colour = vendor · Number = total parameters (B) dense MoE
Models in this bench
26 visible
- 1. qwen3.6-27b gguf 4bit 98% · 281s · 22 t/s · 27 GB
- 2. qwen3.6-35b-a3b gguf 8bit 97% · 116s · 68 t/s · 53 GB
- 3. qwen3.5-122b-a10b gguf 4bit 97% · 715s · 10 t/s · 102 GB
- 4. qwen3.5-9b gguf 8bit 97% · 292s · 44 t/s · 18 GB
- 5. qwen3.5-9b gguf 4bit 96% · 315s · 59 t/s · 13 GB
- 6. qwen3.5-2b gguf 4bit 93% · 15s · 162 t/s · 8 GB
- 7. gemma-4-31b gguf 4bit 93% · 370s · 21 t/s · 30 GB
- 8. ministral-3-14b-reasoning gguf 4bit 92% · 46s · 47 t/s · 16 GB
- 9. qwen3-vl-8b mlx 4bit 92% · 37s · 79 t/s · 12 GB
- 10. glm-4.6v-flash mlx 4bit 91% · 82s · 64 t/s · 13 GB
- 11. gemma-3-27b mlx 4bit 86% · 68s · 28 t/s · 26 GB
- 12. gemma-4-e4b gguf 8bit 85% · 79s · 67 t/s · 16 GB
- 13. gemma-4-e4b gguf 4bit 83% · 58s · 87 t/s · 12 GB
- 14. qwen3.5-35b-a3b gguf 4bit 78% · 122s · 79 t/s · 33 GB
- 15. gemma-3-12b mlx 4bit 77% · 31s · 56 t/s · 15 GB
- 16. qwen3.6-35b-a3b gguf 4bit 75% · 139s · 82 t/s · 33 GB
- 17. gemma-4-26b-a4b gguf 8bit 74% · 234s · 72 t/s · 41 GB
- 18. gemma-3n-e4b mlx 4bit 72% · 17s · 80 t/s · 12 GB
- 19. qwen3.5-9b-mlx mlx 4bit 71% · 179s · 85 t/s · 12 GB
- 20. gemma-4-26b-a4b gguf 4bit 71% · 197s · 89 t/s · 27 GB
- 21. qwen3.5-4b gguf 4bit 70% · 176s · 85 t/s · 9 GB
- 22. gemma-3-4b mlx 4bit 65% · 12s · 141 t/s · 9 GB
- 23. gemma-4-e2b gguf 8bit 63% · 34s · 111 t/s · 12 GB
- 24. gemma-4-e2b gguf 4bit 55% · 20s · 138 t/s · 10 GB
- 25. nemotron-3-nano-omni gguf 8bit 28% · 192s · 77 t/s · 50 GB
- 26. nemotron-3-nano-omni gguf 4bit 27% · 156s · 86 t/s · 38 GB
| Model | Vendor | Quant | Ctx | Released | RAM | tok/s | Tokens | Wall | Score |
|---|---|---|---|---|---|---|---|---|---|
| qwen3.6-27b | qwen | gguf 4bit | 256k | 2026-04-21 | 16.3 GB | 22 | 5301 | 281.0 s | 98% |
| qwen3.6-35b-a3b | qwen | gguf 8bit | 256k | 2026-04-15 | 35.2 GB | 68 | 7184 | 116.2 s | 97% |
| qwen3.5-122b-a10b | lmstudio-community | gguf 4bit | 256k | 2026-02-24 | 70.0 GB | 10 | 8842 | 715.4 s | 97% |
| qwen3.5-9b | qwen | gguf 8bit | 256k | 2026-02-27 | 9.7 GB | 44 | 12243 | 291.7 s | 97% |
| qwen3.5-9b | qwen | gguf 4bit | 256k | 2026-02-27 | 6.1 GB | 59 | 17594 | 315.4 s | 96% |
| qwen3.5-2b | lmstudio-community | gguf 4bit | 256k | 2026-03-02 | 1.8 GB | 162 | 1488 | 14.9 s | 93% |
| gemma-4-31b | gguf 4bit | 256k | 2026-03-12 | 18.5 GB | 21 | 7477 | 370.5 s | 93% | |
| ministral-3-14b-reasoning | mistralai | gguf 4bit | 256k | 2025-10-31 | 8.5 GB | 47 | 1570 | 46.3 s | 92% |
| qwen3-vl-8b | qwen | mlx 4bit | 256k | 2025-10-11 | 5.4 GB | 79 | 1552 | 37.4 s | 92% |
| glm-4.6v-flash | zai-org | mlx 4bit | 128k | 2025-12-07 | 6.6 GB | 64 | 3559 | 82.2 s | 91% |
| gemma-3-27b | mlx 4bit | 128k | 2025-03-01 | 15.7 GB | 28 | 1566 | 68.3 s | 86% | |
| gemma-4-e4b | gguf 8bit | 128k | 2026-03-02 | 8.4 GB | 67 | 5172 | 79.2 s | 85% | |
| gemma-4-e4b | gguf 4bit | 128k | 2026-03-02 | 5.9 GB | 87 | 4829 | 58.0 s | 83% | |
| qwen3.5-35b-a3b | qwen | gguf 4bit | 256k | 2026-02-24 | 20.6 GB | 79 | 8642 | 122.4 s | 78% |
| gemma-3-12b | mlx 4bit | 128k | 2025-03-01 | 7.5 GB | 56 | 1378 | 31.5 s | 77% | |
| qwen3.6-35b-a3b | qwen | gguf 4bit | 256k | 2026-04-15 | 20.6 GB | 82 | 10369 | 139.0 s | 75% |
| gemma-4-26b-a4b | gguf 8bit | 256k | 2026-03-12 | 26.1 GB | 72 | 16643 | 234.3 s | 74% | |
| gemma-3n-e4b | mlx 4bit | 32k | 2025-06-03 | 5.5 GB | 80 | 1186 | 17.2 s | 72% | |
| qwen3.5-9b-mlx | mlx-community | mlx 4bit | 256k | 2026-02-27 | 5.6 GB | 85 | 14192 | 178.7 s | 71% |
| gemma-4-26b-a4b | gguf 4bit | 256k | 2026-03-12 | 16.8 GB | 89 | 17214 | 197.0 s | 71% | |
| qwen3.5-4b | lmstudio-community | gguf 4bit | 256k | 2026-03-02 | 3.2 GB | 85 | 13686 | 176.1 s | 70% |
| gemma-3-4b | mlx 4bit | 128k | 2025-02-20 | 2.8 GB | 141 | 1061 | 11.7 s | 65% | |
| gemma-4-e2b | gguf 8bit | 128k | 2026-03-02 | 5.5 GB | 111 | 3574 | 34.0 s | 63% | |
| gemma-4-e2b | gguf 4bit | 128k | 2026-03-02 | 4.1 GB | 138 | 2509 | 20.0 s | 55% | |
| nemotron-3-nano-omni | nvidia | gguf 8bit | 256k | 2026-04-20 | 32.8 GB | 77 | 14548 | 192.4 s | 28% |
| nemotron-3-nano-omni | nvidia | gguf 4bit | 256k | 2026-04-20 | 24.3 GB | 86 | 13175 | 156.3 s | 27% |
| gemma-4-31b | gguf 8bit | 256k | 2026-03-12 | — | 0 | — | 0.0 s | 0% | |
| qwen3.6-27b | qwen | gguf 8bit | 256k | 2026-04-21 | — | 0 | — | 0.0 s | 0% |
Click a row to open the model detail page. Hover shows available render previews. Column headers are sortable.