leaderboard: short model labels, recall toggle for BEIR, sortable+searchable tables#27

Merged
radinhamidi merged 1 commit into main from leaderboard/polish-tables on May 20, 2026


@radinhamidi commented on May 20, 2026

Summary

Three follow-ups on the redesigned leaderboard (PR #26).

1. Short model labels. The display strips the provider prefix everywhere a model name is shown: openai/gpt-4.1 → gpt-4.1, Qwen/Qwen2.5-7B-Instruct → Qwen2.5-7B-Instruct. The canonical id stays in the data (run paths, hashes, registry); only the visual presentation changes. New model_display field on matrix rows, per-X view rows, runs.json, models.json.

2. Recall toggle works on BEIR. On the home matrix, toggling to recall now shows real values for every column. The secondary metric for each dataset is now derived from what's actually present in the matrix data (recall_1000 for DL/DL-HARD, recall_100 for BEIR) instead of being driven by the registry whitelist. Fixes labels like "ArguAna / R@1k" — now correctly ArguAna / R@100.

3. Sortable + searchable tables. New InteractiveTable.astro component wraps every matrix table with a global search input and click-to-sort column headers. Numeric columns carry data-sort-value so 0.6952 sorts numerically. Applied to home + /datasets/{id} + /models/{id} + /methods/{id} + /retrievers/{id}.
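
The sorting and search behaviour in item 3 could look roughly like the sketch below. It is not the code in InteractiveTable.astro: the `TableRow` shape and the function names are hypothetical, and the score 0.71 is a made-up sample value. It only illustrates why a numeric sort key matters and how search over concatenated row text can work.

```typescript
// Hypothetical row shape: `cells` holds the visible text of each column,
// `sortValues` mirrors a data-sort-value attribute (null when absent).
interface TableRow {
  cells: string[];
  sortValues: (string | null)[];
}

// Prefer the explicit sort value, fall back to the visible cell text.
function sortKey(row: TableRow, col: number): string {
  return row.sortValues[col] ?? row.cells[col] ?? "";
}

// Compare two rows on one column. Numeric comparison is used only when
// both keys parse as finite numbers; otherwise fall back to text order.
function compareRows(a: TableRow, b: TableRow, col: number, ascending = true): number {
  const rawA = sortKey(a, col);
  const rawB = sortKey(b, col);
  const numA = Number(rawA);
  const numB = Number(rawB);
  const bothNumeric =
    rawA.trim() !== "" && rawB.trim() !== "" &&
    Number.isFinite(numA) && Number.isFinite(numB);
  const order = bothNumeric ? numA - numB : rawA.localeCompare(rawB);
  return ascending ? order : -order;
}

// Case-insensitive search over the concatenated text of the row.
function matchesSearch(row: TableRow, query: string): boolean {
  const needle = query.trim().toLowerCase();
  return needle === "" || row.cells.join(" ").toLowerCase().includes(needle);
}

// Illustrative rows only; model names and the second score are invented.
const rows: TableRow[] = [
  { cells: ["model-a", "0.6952"], sortValues: [null, "0.6952"] },
  { cells: ["model-b", "0.71"], sortValues: [null, "0.71"] },
];
const byScoreDesc = [...rows].sort((a, b) => compareRows(a, b, 1, false));
```

Keeping the comparison in a pure function like this makes it testable without a DOM; a click handler on a column header would only need to pick the column index and direction.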

The home page's existing retriever/model/metric filter chips coexist with the InteractiveTable via a class-and-event handshake: the chips set qg-chip-hidden on rows and dispatch qg-itable-reapply, and the table script treats that class as a hard veto over its own search.
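
One way to picture the veto is as a pure visibility rule, sketched below. The `RowState` type and both function names are assumptions made for illustration; in the real component the `chipHidden` flag would presumably correspond to the qg-chip-hidden class on the row element, and the rule would be re-run when qg-itable-reapply fires.

```typescript
// Hypothetical snapshot of one table row as the table script might see it.
interface RowState {
  text: string;        // concatenated cell text used for search
  chipHidden: boolean; // true when a filter chip has hidden the row
}

// A chip-hidden row is never shown, whatever the search query says.
function isRowVisible(row: RowState, query: string): boolean {
  if (row.chipHidden) return false; // hard veto from the chips
  const needle = query.trim().toLowerCase();
  return needle === "" || row.text.toLowerCase().includes(needle);
}

// Number of rows that survive both the chips and the search.
function shownCount(rows: RowState[], query: string): number {
  return rows.filter((row) => isRowVisible(row, query)).length;
}
```

Because the chips only mark rows and the table alone decides final visibility, clearing the search box cannot accidentally reveal a row that a chip has filtered out.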

Test Plan

  • pnpm -F @qg/leaderboard build — clean, 1113 pages.
  • After merge: CF Pages auto-deploys; verify (a) model column shows just gpt-4.1 / Qwen2.5-72B-Instruct, (b) toggling to Recall on the home matrix populates BEIR columns, (c) every matrix table has a search box + clickable sort headers.

🤖 Generated with Claude Code

…e+searchable tables

Three follow-ups on the redesign:

1. Strip provider prefix from displayed model names (openai/gpt-4.1 →
   gpt-4.1, Qwen/Qwen2.5-7B-Instruct → Qwen2.5-7B-Instruct). The
   canonical id stays in the data (run paths, hashes, registry); only
   the visual presentation changes. New `model_display` field on
   matrix rows, per-X view rows, runs.json, models.json.

2. Fix the recall toggle on the home matrix for BEIR datasets. The
   secondary metric is now derived from what's actually present in the
   matrix data — recall_1000 for DL/DL-HARD, recall_100 for BEIR —
   instead of being driven by the registry whitelist (which over-
   specified). Toggling now shows real values for every column.

3. New InteractiveTable component wraps every matrix table with a
   global search input and click-to-sort column headers. Numeric
   columns get data-sort-value so 0.6952 sorts as a number, not a
   string. Search filters by concatenated row text. Applied to home,
   /datasets/{id}, /models/{id}, /methods/{id}, /retrievers/{id}.

   The home page's existing retriever/model/metric chips coexist with
   the InteractiveTable: chips set a qg-chip-hidden class on rows and
   dispatch a qg-itable-reapply event, which the table script honours
   when computing visibility + the shown-count badge.
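
As a rough illustration of points 1 and 2 above (not the code in this commit): the helper names are hypothetical, the sketch assumes the provider prefix is everything up to the first slash, and it assumes metric keys follow a `recall_<cutoff>` pattern. The `ndcg_10` key in the usage lines is likewise only a placeholder for a primary metric.

```typescript
// Point 1: derive a display label while leaving the canonical id untouched.
// Assumption: the provider prefix ends at the first "/".
function modelDisplay(canonicalId: string): string {
  const slash = canonicalId.indexOf("/");
  return slash === -1 ? canonicalId : canonicalId.slice(slash + 1);
}

// Point 2: pick the secondary metric from keys actually present in the
// matrix cells of one dataset, instead of consulting a fixed whitelist.
type MetricValues = Record<string, number>;

function secondaryMetric(cells: MetricValues[]): string | null {
  for (const cell of cells) {
    const key = Object.keys(cell).find((k) => k.startsWith("recall_"));
    if (key !== undefined) return key;
  }
  return null; // no recall metric recorded for this dataset
}

// Turn a key such as recall_1000 into a short column label such as R@1k.
function metricLabel(key: string): string {
  const match = /^recall_(\d+)$/.exec(key);
  if (!match) return key;
  const cutoff = Number(match[1]);
  const short = cutoff >= 1000 && cutoff % 1000 === 0 ? `${cutoff / 1000}k` : String(cutoff);
  return `R@${short}`;
}

// Usage with invented values: a BEIR-style cell that only carries recall_100.
const beirKey = secondaryMetric([{ ndcg_10: 0.5, recall_100: 0.8 }]);
const beirLabel = beirKey === null ? "n/a" : metricLabel(beirKey);
```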

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@radinhamidi merged commit 120ecd9 into main on May 20, 2026
2 checks passed
@radinhamidi deleted the leaderboard/polish-tables branch on May 20, 2026, 06:20