Write blog post: "There Is No Best OCR Model"

Write up the key findings as a blog post / HF blog:

- Rankings change completely by document type (BPL vs Britannica vs UFO)
- Document type matters more than model size (a 0.9B model beats a 4B model on some collections)
- Judges agree on clusters but disagree on ordering within clusters
- The tool lets anyone create their own per-collection leaderboard
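
For the judge-agreement point, the post could include a small rank-correlation check. The sketch below is one possible way to quantify it, not the method used in this repo: it computes Kendall's tau between two judges' orderings in plain Python. The model names and both orderings are made-up placeholders, not real results.

```python
from itertools import combinations


def kendall_tau(order_a, order_b):
    """Kendall's tau between two orderings of the same items (no ties).

    Returns 1.0 for identical orderings, -1.0 for fully reversed ones.
    """
    if len(set(order_a)) != len(order_a) or set(order_a) != set(order_b):
        raise ValueError("orderings must contain the same unique items")
    if len(order_a) < 2:
        raise ValueError("need at least two items to compare")

    # Map each item to its rank position under each judge.
    pos_a = {item: i for i, item in enumerate(order_a)}
    pos_b = {item: i for i, item in enumerate(order_b)}

    concordant = discordant = 0
    for x, y in combinations(order_a, 2):
        # A pair is concordant when both judges order it the same way.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1

    return (concordant - discordant) / (concordant + discordant)


# Placeholder orderings (hypothetical): both judges put model_a/model_b
# ahead of model_c/model_d, but swap the order inside each cluster.
judge_1 = ["model_a", "model_b", "model_c", "model_d"]
judge_2 = ["model_b", "model_a", "model_d", "model_c"]

tau = kendall_tau(judge_1, judge_2)
print(f"Kendall tau between judges: {tau:.3f}")
```

In this placeholder case the two judges agree on every cross-cluster pair and disagree on both within-cluster pairs, so tau is positive but well below 1. Running the same check per collection would show whether that pattern holds on the real judge outputs.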

Link to the viewer Space, Hub datasets, and this repo.