leaderboard: short model labels, recall toggle for BEIR, sortable+searchable tables #27
Merged
Three follow-ups on the redesign:
1. Strip provider prefix from displayed model names (openai/gpt-4.1 →
gpt-4.1, Qwen/Qwen2.5-7B-Instruct → Qwen2.5-7B-Instruct). The
canonical id stays in the data (run paths, hashes, registry); only
the visual presentation changes. New `model_display` field on
matrix rows, per-X view rows, runs.json, models.json.
2. Fix the recall toggle on the home matrix for BEIR datasets. The
secondary metric is now derived from what's actually present in the
matrix data — recall_1000 for DL/DL-HARD, recall_100 for BEIR —
instead of being driven by the registry whitelist (which over-
specified). Toggling now shows real values for every column.
3. New InteractiveTable component wraps every matrix table with a
global search input and click-to-sort column headers. Numeric
columns get data-sort-value so 0.6952 sorts as a number, not a
string. Search filters by concatenated row text. Applied to home,
/datasets/{id}, /models/{id}, /methods/{id}, /retrievers/{id}.
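The data-side changes in items 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `MatrixRow` shape, the `metrics` map, the `toRow` helper, the `ndcg_10` key, the "strip up to the first slash" rule, and the `recall_` prefix check are all assumptions. Only `model_display`, `recall_1000`, and `recall_100` are named in the description.

```typescript
// Hypothetical row shape; only `model_display` is named in the PR.
interface MatrixRow {
  model: string;                   // canonical id, e.g. "openai/gpt-4.1"
  model_display: string;           // short label, used for display only
  dataset: string;
  metrics: Record<string, number>; // e.g. { ndcg_10: 0.7, recall_100: 0.9 }
}

// Assumed rule: drop everything up to and including the first "/".
// Ids without a provider prefix are returned unchanged.
function displayName(modelId: string): string {
  const slash = modelId.indexOf("/");
  return slash === -1 ? modelId : modelId.slice(slash + 1);
}

// Build a row that keeps the canonical id and adds the display label.
function toRow(
  model: string,
  dataset: string,
  metrics: Record<string, number>,
): MatrixRow {
  return { model, model_display: displayName(model), dataset, metrics };
}

// Pick each dataset's secondary metric from the keys actually present
// in the matrix data, rather than from a registry whitelist.
// Assumes recall metrics are stored under keys starting with "recall_".
function secondaryMetricByDataset(rows: MatrixRow[]): Map<string, string> {
  const found = new Map<string, string>();
  for (const row of rows) {
    if (found.has(row.dataset)) continue;
    const key = Object.keys(row.metrics).find((k) => k.startsWith("recall_"));
    if (key !== undefined) found.set(row.dataset, key);
  }
  return found;
}
```

Under these assumptions, a dataset whose rows carry `recall_100` is labelled with `recall_100` even if a whitelist elsewhere lists `recall_1000`, which is the behaviour item 2 describes.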
The home page's existing retriever/model/metric chips coexist with
the InteractiveTable: chips set a qg-chip-hidden class on rows and
dispatch a qg-itable-reapply event, which the table script honours
when computing visibility + the shown-count badge.
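The sort and visibility rules described for item 3 can be sketched without the DOM, as plain functions. This is only an illustration under stated assumptions: the `TableRow` shape and the function names are hypothetical, `sortValue` stands in for a cell's `data-sort-value` attribute, `chipHidden` stands in for the `qg-chip-hidden` class, and case-insensitive substring matching is a guess at how the search behaves.

```typescript
// Hypothetical DOM-free view of a table row.
interface TableRow {
  text: string;        // concatenated row text used by the search box
  sortValue: string;   // stand-in for the cell's data-sort-value
  chipHidden: boolean; // stand-in for the qg-chip-hidden class
}

// Compare numerically when both values parse as numbers, so that
// "0.6952" and "10" order as numbers rather than as strings.
function compareCells(a: string, b: string): number {
  const na = Number(a);
  const nb = Number(b);
  const bothNumeric =
    a.trim() !== "" && b.trim() !== "" && !isNaN(na) && !isNaN(nb);
  return bothNumeric ? na - nb : a.localeCompare(b);
}

// A chip-hidden row is never shown, whatever the search query is.
function isVisible(row: TableRow, query: string): boolean {
  if (row.chipHidden) return false;
  const needle = query.trim().toLowerCase();
  return row.text.toLowerCase().indexOf(needle) !== -1;
}

// Rows to display, in sort order; the length of the result is what
// a shown-count badge would report.
function visibleSorted(
  rows: TableRow[],
  query: string,
  descending: boolean = false,
): TableRow[] {
  const shown = rows.filter((r) => isVisible(r, query));
  const sign = descending ? -1 : 1;
  shown.sort((x, y) => sign * compareCells(x.sortValue, y.sortValue));
  return shown;
}
```

In a real component, a handler for the `qg-itable-reapply` event would presumably re-run something like `visibleSorted` after the chips have updated their classes.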
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Three follow-ups on the redesigned leaderboard (PR #26).
1. Short model labels. Display strips the provider prefix everywhere it shows: `openai/gpt-4.1` → `gpt-4.1`, `Qwen/Qwen2.5-7B-Instruct` → `Qwen2.5-7B-Instruct`. The canonical id stays in the data (run paths, hashes, registry); only the visual presentation changes. New `model_display` field on matrix rows, per-X view rows, `runs.json`, `models.json`.
2. Recall toggle works on BEIR. On the home matrix, toggling to recall now shows real values for every column. The secondary metric for each dataset is now derived from what's actually present in the matrix data (`recall_1000` for DL/DL-HARD, `recall_100` for BEIR) instead of being driven by the registry whitelist. Fixes labels like "ArguAna / R@1k", which now correctly read `ArguAna / R@100`.
3. Sortable + searchable tables. New `InteractiveTable.astro` component wraps every matrix table with a global search input and click-to-sort column headers. Numeric columns carry `data-sort-value` so `0.6952` sorts numerically. Applied to home, `/datasets/{id}`, `/models/{id}`, `/methods/{id}`, and `/retrievers/{id}`.

The home page's existing retriever/model/metric filter chips coexist with the InteractiveTable via a class-and-event handshake: chips set `qg-chip-hidden` on rows and dispatch `qg-itable-reapply`, and the table script respects that as a hard veto over its own search.

Test Plan
- `pnpm -F @qg/leaderboard build`: clean, 1113 pages.
- […] `gpt-4.1` / `Qwen2.5-72B-Instruct`, (b) toggling to Recall on the home matrix populates BEIR columns, (c) every matrix table has a search box + clickable sort headers.

🤖 Generated with Claude Code