Vision

OCR for vision-capable models: four sub-tasks — handwritten meeting notes in three difficulty tiers (easy / medium / hard) plus an old book page set in Fraktur typeface.

Task & test logic in detail

Task: Four OCR sub-tasks, one image each. (1)–(3) Handwritten meeting notes in three difficulty tiers (easy / medium / hard) — the model must transcribe the text. (4) An old book page in Fraktur typeface — same task. What is tested: OCR quality, recognising layout structure (columns, bullet points, dates), handling of illegible handwriting and historical letterforms (long-s, ligatures). Why models fail: text-only models have no vision capability (filtered out). Weak VLMs only recognise the clearest part. Some truncate output or get stuck in reasoning without producing a visible answer.

Prompt

System prompt

Du bist OCR-Spezialist für deutsche Handschrift.

Developer prompt

Auf dem Bild siehst du eine handschriftliche Meeting-Notiz mit klarer Struktur und gut lesbarer Schrift. Transkribiere den gesamten lesbaren Text. Behalte die Anordnung bei (Überschrift, Spalten, To-Dos). Bei unleserlichen Stellen schreibe '[unleserlich]'. Gib ausschließlich den puren OCR-Text im Markdown-Format zurück — keine Vorbemerkung, keine Erklärungen, kein Code-Fence.

Wall-time vs. quality

Max RAM

X = wall-time for this bench · Y = score (0–100 %) in this bench. Optimum is top-left — fast and good. RAM estimate for 64k context: 4 GB system + model weights + max(2 GB, 40% of weights) for KV cache.

Colour = vendor · Number = total parameters (B) dense MoE

Models in this bench

28 visible

1. qwen3.6-27b gguf 4bit 98% · 281s · 22 t/s · 27 GB
2. qwen3.6-35b-a3b gguf 8bit 97% · 116s · 68 t/s · 53 GB
3. qwen3.5-122b-a10b gguf 4bit 97% · 715s · 10 t/s · 102 GB
4. qwen3.5-9b gguf 8bit 97% · 292s · 44 t/s · 18 GB
5. qwen3.5-9b gguf 4bit 96% · 315s · 59 t/s · 13 GB
6. qwen3.5-2b gguf 4bit 93% · 15s · 162 t/s · 8 GB
7. gemma-4-31b gguf 4bit 93% · 370s · 21 t/s · 30 GB
8. ministral-3-14b-reasoning gguf 4bit 92% · 46s · 47 t/s · 16 GB
9. qwen3-vl-30b mlx 4bit 92% · 33s · 82 t/s · 28 GB
10. qwen3-vl-8b mlx 4bit 92% · 37s · 79 t/s · 12 GB
11. glm-4.6v-flash mlx 4bit 91% · 82s · 64 t/s · 13 GB
12. devstral-small-2-2512 mlx 4bit 89% · 85s · 33 t/s · 22 GB
13. gemma-3-27b mlx 4bit 86% · 68s · 28 t/s · 26 GB
14. gemma-4-e4b gguf 8bit 85% · 79s · 67 t/s · 16 GB
15. gemma-4-e4b gguf 4bit 83% · 58s · 87 t/s · 12 GB
16. qwen3.5-35b-a3b gguf 4bit 78% · 122s · 79 t/s · 33 GB
17. gemma-3-12b mlx 4bit 77% · 31s · 56 t/s · 15 GB
18. qwen3.6-35b-a3b gguf 4bit 75% · 139s · 82 t/s · 33 GB
19. gemma-4-26b-a4b gguf 8bit 74% · 234s · 72 t/s · 41 GB
20. gemma-3n-e4b mlx 4bit 72% · 17s · 80 t/s · 12 GB
21. qwen3.5-9b-mlx mlx 4bit 71% · 179s · 85 t/s · 12 GB
22. gemma-4-26b-a4b gguf 4bit 71% · 197s · 89 t/s · 27 GB
23. qwen3.5-4b gguf 4bit 70% · 176s · 85 t/s · 9 GB
24. gemma-3-4b mlx 4bit 65% · 12s · 141 t/s · 9 GB
25. gemma-4-e2b gguf 8bit 63% · 34s · 111 t/s · 12 GB
26. gemma-4-e2b gguf 4bit 55% · 20s · 138 t/s · 10 GB
27. nemotron-3-nano-omni gguf 8bit 28% · 192s · 77 t/s · 50 GB
28. nemotron-3-nano-omni gguf 4bit 27% · 156s · 86 t/s · 38 GB

Model	Vendor	Quant	Ctx	Released	RAM	tok/s	Tokens	Wall	Score
qwen3.6-27b	qwen	gguf 4bit	256k	2026-04-21	16.3 GB	22	5301	281.0 s	98%
qwen3.6-35b-a3b	qwen	gguf 8bit	256k	2026-04-15	35.2 GB	68	7184	116.2 s	97%
qwen3.5-122b-a10b	lmstudio-community	gguf 4bit	256k	2026-02-24	70.0 GB	10	8842	715.4 s	97%
qwen3.5-9b	qwen	gguf 8bit	256k	2026-02-27	9.7 GB	44	12243	291.7 s	97%
qwen3.5-9b	qwen	gguf 4bit	256k	2026-02-27	6.1 GB	59	17594	315.4 s	96%
qwen3.5-2b	lmstudio-community	gguf 4bit	256k	2026-03-02	1.8 GB	162	1488	14.9 s	93%
gemma-4-31b	google	gguf 4bit	256k	2026-03-12	18.5 GB	21	7477	370.5 s	93%
ministral-3-14b-reasoning	mistralai	gguf 4bit	256k	2025-10-31	8.5 GB	47	1570	46.3 s	92%
qwen3-vl-30b	qwen	mlx 4bit	256k	2025-10-04	17.0 GB	82	1610	33.0 s	92%
qwen3-vl-8b	qwen	mlx 4bit	256k	2025-10-11	5.4 GB	79	1552	37.4 s	92%
glm-4.6v-flash	zai-org	mlx 4bit	128k	2025-12-07	6.6 GB	64	3559	82.2 s	91%
devstral-small-2-2512	mistralai	mlx 4bit	384k	2025-12-09	13.2 GB	33	1549	85.5 s	89%
gemma-3-27b	google	mlx 4bit	128k	2025-03-01	15.7 GB	28	1566	68.3 s	86%
gemma-4-e4b	google	gguf 8bit	128k	2026-03-02	8.4 GB	67	5172	79.2 s	85%
gemma-4-e4b	google	gguf 4bit	128k	2026-03-02	5.9 GB	87	4829	58.0 s	83%
qwen3.5-35b-a3b	qwen	gguf 4bit	256k	2026-02-24	20.6 GB	79	8642	122.4 s	78%
gemma-3-12b	google	mlx 4bit	128k	2025-03-01	7.5 GB	56	1378	31.5 s	77%
qwen3.6-35b-a3b	qwen	gguf 4bit	256k	2026-04-15	20.6 GB	82	10369	139.0 s	75%
gemma-4-26b-a4b	google	gguf 8bit	256k	2026-03-12	26.1 GB	72	16643	234.3 s	74%
gemma-3n-e4b	google	mlx 4bit	32k	2025-06-03	5.5 GB	80	1186	17.2 s	72%
qwen3.5-9b-mlx	mlx-community	mlx 4bit	256k	2026-02-27	5.6 GB	85	14192	178.7 s	71%
gemma-4-26b-a4b	google	gguf 4bit	256k	2026-03-12	16.8 GB	89	17214	197.0 s	71%
qwen3.5-4b	lmstudio-community	gguf 4bit	256k	2026-03-02	3.2 GB	85	13686	176.1 s	70%
gemma-3-4b	google	mlx 4bit	128k	2025-02-20	2.8 GB	141	1061	11.7 s	65%
gemma-4-e2b	google	gguf 8bit	128k	2026-03-02	5.5 GB	111	3574	34.0 s	63%
gemma-4-e2b	google	gguf 4bit	128k	2026-03-02	4.1 GB	138	2509	20.0 s	55%
nemotron-3-nano-omni	nvidia	gguf 8bit	256k	2026-04-20	32.8 GB	77	14548	192.4 s	28%
nemotron-3-nano-omni	nvidia	gguf 4bit	256k	2026-04-20	24.3 GB	86	13175	156.3 s	27%
gemma-4-31b	google	gguf 8bit	256k	2026-03-12	—	0	—	0.0 s	0%
qwen3.6-27b	qwen	gguf 8bit	256k	2026-04-21	—	0	—	0.0 s	0%

Click a row to open the model detail page. Hover shows available render previews. Column headers are sortable.