Write blog post: "There Is No Best OCR Model" #2

@davanstrien

Write up the key findings as a blog post / HF blog:

  • Rankings change completely by document type (BPL vs Britannica vs UFO)
  • Document type matters more than model size (0.9B beats 4B on some collections)
  • Judges agree on clusters but disagree on ordering within clusters
  • Tool lets anyone create their own per-collection leaderboard

Link to the viewer Space, Hub datasets, and this repo.
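
For the post, the first finding could be illustrated with a small example. The following is a minimal sketch, not the repo's actual implementation: it assumes judgments are stored as `(collection, winner, loser)` tuples and ranks models by plain win rate. The collection names, model names, and outcomes are all invented for illustration, and `leaderboards` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical pairwise judgments: (collection, winner, loser).
# Names and outcomes are invented toy data, not real results.
JUDGMENTS = [
    ("collection_a", "model_small", "model_large"),
    ("collection_a", "model_small", "model_large"),
    ("collection_a", "model_large", "model_small"),
    ("collection_b", "model_large", "model_small"),
    ("collection_b", "model_large", "model_small"),
    ("collection_b", "model_large", "model_small"),
]


def leaderboards(judgments):
    """Return {collection: [(model, win_rate), ...]}, best model first."""
    wins = defaultdict(lambda: defaultdict(int))
    games = defaultdict(lambda: defaultdict(int))
    for collection, winner, loser in judgments:
        wins[collection][winner] += 1
        games[collection][winner] += 1
        games[collection][loser] += 1

    boards = {}
    for collection, per_model in games.items():
        rates = {m: wins[collection][m] / n for m, n in per_model.items()}
        # Sort by descending win rate, then by name for a stable order.
        boards[collection] = sorted(rates.items(), key=lambda kv: (-kv[1], kv[0]))
    return boards


# One leaderboard per collection.
boards = leaderboards(JUDGMENTS)

# Pooling every judgment into a single leaderboard hides the difference.
pooled = leaderboards([("all", w, l) for _, w, l in JUDGMENTS])["all"]

for name, board in boards.items():
    print(name, board)
print("pooled", pooled)
```

In this toy data the pooled leaderboard puts `model_large` first, while the per-collection view shows `model_small` ahead on `collection_a`. That is the kind of reversal a single global ranking cannot show.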
