Proposal: objective, sortable ranking columns
models.dev already exposes the facts needed to rank models objectively (cost, context window, output limit, capability flags, modality breadth, release date), but there's no way to rank the catalog as a whole: you can only sort by one column at a time.
I'd like to add three transparently computed, sortable score columns (Overall, Value, Capability), derived purely from those existing fields (no benchmarks, no hand-grading), plus a dynamic rank (#) column.
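To make the dynamic rank (#) column concrete, here is a minimal sketch of how a rank could be recomputed from whichever score column is currently sorted. The `Row` type, the `rankBy` helper, and the sample values are hypothetical illustrations, not code from the PR.

```typescript
// Hypothetical row shape: one entry per model, carrying the three proposed scores.
type ScoreKey = "overall" | "value" | "capability";

interface Row {
  id: string;
  overall: number;
  value: number;
  capability: number;
}

// Sort a copy of the rows by the chosen score (highest first) and attach a
// 1-based rank. Ties keep their input order because Array.prototype.sort is stable.
function rankBy(rows: Row[], key: ScoreKey): Array<Row & { rank: number }> {
  return [...rows]
    .sort((a, b) => b[key] - a[key])
    .map((row, index) => ({ ...row, rank: index + 1 }));
}

// Made-up sample data, for illustration only.
const sample: Row[] = [
  { id: "model-a", overall: 0.62, value: 0.9, capability: 0.4 },
  { id: "model-b", overall: 0.71, value: 0.5, capability: 0.95 },
  { id: "model-c", overall: 0.55, value: 0.7, capability: 0.45 },
];

const byValue = rankBy(sample, "value");
const byCapability = rankBy(sample, "capability");
```

Because the rank is derived from the active sort rather than stored, the same model can be #1 under Value and #3 under Capability, which is the point of offering several lenses.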
Why columns, not one ranking
Any single "best model" score is an opinion. Shipping it as sortable columns keeps the data neutral and lets each user pick the lens that fits their use case (cheap-yet-capable vs. raw feature breadth vs. balanced).
Honest limitation
The catalog has no quality/benchmark field, so this measures spec-breadth-per-dollar, not model intelligence. Broad, cheap, omni-modal models (and auto-routers) rank highest, which is correct given the inputs but worth stating plainly. If an objective quality signal ever lands in the schema, it drops straight into the existing blend.
Implementation
Web-only; canonical api.json is unchanged. Weights live in one documented place so they're easy to audit or tune.
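As an illustration of weights living in one documented place, the sketch below keeps every weight in a single constant and blends sub-scores that are assumed to be already normalized to [0, 1]. All names (`SubScores`, `WEIGHTS`, `blend`) and all numeric weights are assumptions made up for this example; the actual fields and values in the PR may differ.

```typescript
// Hypothetical sub-scores, each assumed to be normalized to the range [0, 1].
interface SubScores {
  cost: number;         // cheaper scores higher
  context: number;      // larger context window scores higher
  output: number;       // larger output limit scores higher
  capabilities: number; // more capability flags scores higher
  modalities: number;   // broader modality support scores higher
  recency: number;      // newer release date scores higher
}

// Single source of truth for the weights (illustrative values, not the PR's).
const WEIGHTS: Record<keyof SubScores, number> = {
  cost: 0.3,
  context: 0.2,
  output: 0.1,
  capabilities: 0.2,
  modalities: 0.1,
  recency: 0.1,
};

// Weighted average of the sub-scores. Dividing by the total weight keeps the
// result in [0, 1] even if the weights are later tuned and no longer sum to 1.
function blend(
  scores: SubScores,
  weights: Record<keyof SubScores, number> = WEIGHTS,
): number {
  const keys = Object.keys(weights) as Array<keyof SubScores>;
  const total = keys.reduce((sum, k) => sum + weights[k], 0);
  const weighted = keys.reduce((sum, k) => sum + weights[k] * scores[k], 0);
  return total === 0 ? 0 : weighted / total;
}
```

Under this layout, auditing or tuning the ranking means reading or editing one object, and a future quality signal would be one more key in `SubScores` plus one more entry in `WEIGHTS`.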
PR: #1892
Open to trimming the column count, making weights configurable, or holding this if a built-in ranking isn't a direction you want. Feedback welcome.