Refactor queryparse by snejus · Pull Request #6663 · beetbox/beets

snejus · 2026-05-21T21:30:03Z

Query Parsing Architecture Consolidation

What this PR does

This PR reshapes query parsing around a single model-level entrypoint and moves parsing/building responsibilities closer to the objects that own them.

At a high level, the change:

Replaces scattered query/sort helper functions with LibModel.parse_query(...).
Centralizes parsing into ModelQuery, QueryTerm, and SortTerm.
Moves field-specific query construction into query classes via FieldQuery.from_model(...).
Updates library and plugin call sites to consume the new API consistently.
Cleans up and modernizes the test suite around the new parsing model.

Why this matters

Before:

Query parsing logic was split across multiple helper functions.
Callers had to know which parsing helper to use.
Field normalization and query construction were partly model-owned, partly parser-owned.
Tests were spread across unrelated files and still used older unittest patterns.

After:

Parsing has one clear entrypoint: Item.parse_query(...) / Album.parse_query(...).
Query terms, sort terms, and final model query assembly each have distinct responsibilities.
Query classes now own more of their own normalization and construction behavior.
The parser tokenizes input once and builds both query and sort from the same token stream.
Tests better reflect the actual parsing contract.

Architecture changes

1. Single parsing entrypoint: `LibModel.parse_query(...)`

The biggest architectural change is that model classes now own parsing through LibModel.parse_query(...), which returns a ModelQuery object:

parsed = Item.parse_query('artist:foo year+')
query = parsed.query
sort = parsed.sort

This replaces older helper-style flows such as:

parse_query_string(...)
parse_query_parts(...)
query_from_strings(...)
sort_from_strings(...)
parse_sorted_query(...)

Result

Reviewers can now think about query parsing as:

input -> model.parse_query(...) -> ModelQuery(query, sort)

instead of multiple helper paths.

2. Parsing pipeline is now explicit

The new parsing flow is layered:

raw string / token list
    ->
`ModelQuery.parse(...)`
    ->
tokenization via `shlex`
    ->
`QueryTerm` parses search terms
`SortTerm` parses sort directives
    ->
`ModelQuery.get_query(...)`
`ModelQuery.get_sort(...)`
    ->
final executable `Query` + `Sort`

This makes responsibilities much clearer:

QueryTerm: parse one search term like 'year:1999' or '/tmp'
SortTerm: parse one sort directive like 'artist+'
ModelQuery: orchestrate the full query string

3. Query construction moved closer to query classes

A second major design improvement is that field-aware query creation now lives in query classes themselves via FieldQuery.from_model(...).

That method now handles concerns that used to be distributed elsewhere:

legacy field alias replacement
shared-table field qualification
path normalization
query-specific pattern normalization via normalize_pattern(...)

Result

Instead of models and parsers coordinating low-level query setup, the query class now owns how it should be built for a model.

That is a cleaner boundary:

parser decides "what query type"
query class decides "how to build itself correctly"

4. Query composition is simpler and flatter

Query.__or__ was added, and both AndQuery and OrQuery now flatten when combined.

This matters because the parser now builds final query trees through operator composition rather than repeatedly wrapping collection queries.

High-level impact

The resulting query objects are simpler:

fewer unnecessary one-element wrappers
more direct TrueQuery, NumericQuery, etc.
flatter AndQuery / OrQuery trees

This is why several tests changed from asserting on nested subqueries[0] shapes to asserting directly on the final query type.

5. Sort handling is centralized and model-driven

Sort parsing now follows the same pattern as query parsing:

SortTerm parses and validates sort tokens
sort resolution happens against model metadata
default sort resolution lives on LibModel.default_sort

Also, sort case sensitivity is now owned by FieldSort rather than threaded through parsing helpers.

Result

Sorting behavior is:

easier to reason about
less coupled to parser plumbing
more naturally configured at the sort layer

Library and plugin integration impact

This PR updates multiple call sites to use the new model API consistently.

Before

query, sort = parse_query_string(q, Item)

or

query, _ = Item.parse_query(q)

After

parsed = Item.parse_query(q)
query = parsed.query
sort = parsed.sort

This affects core library flow and plugin consumers such as:

advancedrewrite
convert
ihate
lyrics
smartplaylist
aura
bpd

Why this is good

All callers now depend on the same contract:

models parse queries
callers consume .query and .sort

That reduces API surface and keeps parsing behavior consistent across the application.

Data flow after this PR

User/config/plugin query string
    ->
`Item.parse_query(...)` / `Album.parse_query(...)`
    ->
`ModelQuery.parse(...)`
    ->
build `query` and `sort`
    ->
`Library.get_results(...)` / plugin matching logic

And inside parsing:

'foo bar, baz year+'
    ->
search terms: `foo`, `bar`, `baz`
sort terms: `year+`
    ->
(`foo` AND `bar`) OR `baz`
+
sort by `year`

Behavioral changes reviewers should notice

This is mostly a refactor, but there are a few meaningful parser-contract changes:

1. Simpler query object shapes

Empty and single-term parses now often return direct queries like TrueQuery or NumericQuery instead of wrapper collection queries.

2. Parsing happens once

ModelQuery.parse(...) tokenizes once and derives both query and sort from the same token stream.

3. Implicit path queries are normalized in parser/core flow

Path-like input such as '/tmp' becomes a path query earlier and more consistently.

4. Sort suffix precedence is clearer

Strings ending in '+' or '-' are treated as sort directives when valid.

5. Public helper surface is reduced

Legacy dbcore parsing re-exports/helpers are removed in favor of the model API.

Test architecture changes

The test portion of this PR is not just cleanup; it aligns the suite with the new architecture.

What changed

Query parsing tests were moved into dedicated test/dbcore/test_queryparse.py
shared test models/sorts/queries were extracted into beets.test.fixtures
older unittest.TestCase style tests were rewritten into grouped pytest classes
many repetitive cases were collapsed into @pytest.mark.parametrize(...)

Why this matters

The tests now mirror the actual parser design:

TestQueryTermParsing
TestQueryFromParts
TestSortFromParts
TestParseSortedQuery
TestModelQueryErrors

That makes the parser contract easier to review and safer to extend.

Reviewer guide

If reviewing for architecture, the most important files are:

beets/dbcore/queryparse.py
beets/dbcore/query.py
beets/library/models.py
beets/library/library.py
beets/test/fixtures.py
test/dbcore/test_queryparse.py

Suggested review order

beets/dbcore/queryparse.py
- new parsing model
beets/dbcore/query.py
- query construction and composition changes
beets/library/models.py
- new model entrypoint and default sort ownership
plugin call sites
- API adoption
tests
- expected behavior and edge cases

Net effect

This PR is a query parsing consolidation.

It moves the system from a helper-driven design to a model-centered parsing API with clearer ownership boundaries:

models own parsing entrypoints
parser objects own syntax handling
query classes own model-aware construction
sort classes own sort behavior
callers consume one consistent parsed result

The overall impact is a smaller public surface, clearer layering, and a test suite that better documents the intended behavior.

This removes ambiguity with Library._fetch

Move sort case-sensitivity config lookup into `FieldSort` as a cached class property instead of threading a `case_insensitive` flag through query parsing helpers. This centralizes sort behavior at the sort implementation boundary and simplifies parse/orchestration call paths while preserving configurable case-sensitive ordering.

github-actions · 2026-05-21T21:30:14Z

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

codecov · 2026-05-21T21:32:25Z

⚠️ Unsupported file format

Upload processing failed due to unsupported file format. Please review the parser error message:

Error parsing JUnit XML in /home/runner/work/beets/beets/.reports/pytest.xml at 1:88641
Caused by:

RuntimeError: Error converting computed name to ValidatedString
Caused by:
    string is too long</code></pre>


For more help, visit our troubleshooting guide.

Copilot

Pull request overview

PR move query parsing from many helper path into one model entrypoint (LibModel.parse_query), then build query+sort via ModelQuery / QueryTerm / SortTerm. PR also update core + plugins + tests to use new API.

Changes:

Replace scattered parse_query_* + dbcore.parse_sorted_query usage with Item/Album.parse_query(...) returning ModelQuery(query, sort).
Move field-aware query construction into FieldQuery.from_model(...) and flatten query composition via Query.__or__ + AndQuery/OrQuery flattening.
Restructure tests: new test/dbcore/test_queryparse.py and shared model fixtures in beets/test/fixtures.py.

grug note: big change, many place for bug hide. grug want fix few correctness holes before merge.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
test/test_library.py	Remove legacy parse-query unittest cases (moved/updated elsewhere).
test/test_dbcore.py	Update fixtures usage + switch tests from `_fetch` to `get_results`; remove old queryparse tests.
test/plugins/test_smartplaylist.py	Update smartplaylist tests to use `Item/Album.parse_query` and new sort expectations.
test/dbcore/test_sort.py	Update sort tests for new parsing + config-driven case sensitivity; clear cached classproperty cache.
test/dbcore/test_queryparse.py	New pytest coverage for `QueryTerm`, `ModelQuery`, and sort parsing behavior.
beetsplug/smartplaylist.py	Switch playlist query parsing to `model_cls.parse_query`; adjust typing.
beetsplug/playlist.py	Update typing alias for item query registration (`QueryByField`).
beetsplug/play.py	Use `Album.default_sort` instead of library helper.
beetsplug/lyrics.py	Use `Item.parse_query` for auto-ignore query parsing.
beetsplug/ihate.py	Use `model_cls.parse_query(...).query` for match checks.
beetsplug/convert.py	Use `Item.parse_query(...).query` for `no_convert` matching.
beetsplug/bpd/init.py	Switch to `FieldQuery.from_model` / `any_writable_media_field_query` new arg order.
beetsplug/aura.py	Switch field queries to `FieldQuery.from_model`.
beetsplug/advancedrewrite.py	Replace manual `shlex` + helper parsing with `Item.parse_query`.
beets/test/fixtures.py	New shared fixtures for dbcore query/sort/model tests.
beets/plugins.py	Update query-prefix typing and comprehension-based merge of plugin query prefixes.
beets/library/queries.py	Deprecate old query parsing helpers and forward to model parsing.
beets/library/models.py	Add `LibModel.parse_query` + `default_sort`; migrate helper query construction to `FieldQuery.from_model`.
beets/library/library.py	Make library fetching accept string/parts/query; route to `model_cls.parse_query` and `Database.get_results`.
beets/dbcore/types.py	Tighten `Type.query` typing to `type[FieldQuery]`.
beets/dbcore/sort.py	Rename `order_clause` -> `clause`; move case-sensitivity ownership into `FieldSort`.
beets/dbcore/queryparse.py	New parsing pipeline (`ModelQuery`, `QueryTerm`, `SortTerm`) built on `shlex` token stream.
beets/dbcore/query.py	Add `FieldQuery.from_model` + pattern normalization hooks; flatten `AndQuery/OrQuery` composition.
beets/dbcore/db.py	Rename internal fetch API to `get_results`; switch to `Sort.clause()`.
beets/dbcore/init.py	Stop re-exporting old queryparse helper functions.

- Introduce SortTerm to centralize sort-part validation and direction parsing. - Move model-specific sort resolution into SortTerm.get_sort, including smart artist field mapping. - Simplify sort_from_strings to build sorts from parsed terms.

- Replace parse_query_part with a QueryTerm parser that captures negate, field, prefix, and pattern in one pass. - Resolve query class selection through QueryTerm so prefix and field-based dispatch are handled consistently. - Move implicit path query normalization into dbcore parsing, simplifying library-level query parsing. - Tighten plugin query typing and update dbcore tests for path behavior and related fixture assertions.

- Replace default sort helpers with LibModel.default_sort as a shared, config-driven sort helper for Item and Album. - Route Library._fetch through get_results with parsed query sort, explicit sort, or model default_sort fallback.

- Add ModelQuery in dbcore.queryparse and expose LibModel.parse_query as the single parsing entrypoint for string and sequence query inputs. - Route library fetch, path format matching, and plugin query parsing through the model-level API, and deprecate beets.library.parse_query_string and parse_query_parts. - Update tests to cover invalid query parsing via ModelQuery and align smartplaylist sort assertions with parsed sort behavior.

- Move field alias handling, shared-table qualification, and path pattern normalization into FieldQuery.from_model with a query-specific normalize_pattern hook. - Update query parsing and plugin call sites to construct queries through query_cls.from_model and remove redundant model-level field query helpers. This keeps model entrypoints focused on orchestration and colocates transformation behavior with the query types that own it.

- Move per-term query building into QueryTerm.get_query and compose field queries with operator-based OR logic. - Add Query.__or__ plus flattening behavior for AndQuery and OrQuery to avoid nested collection queries. - Update dbcore parsing expectations so single-field terms assert SubstringQuery directly.

- Move query and sort construction into `ModelQuery` classmethods and parse tokenized input once. - Switch call sites to consume `parse_query(...).query` and `parse_query(...).sort` instead of tuple unpacking. - Remove legacy `dbcore` re-exports and obsolete helper functions in `queryparse`. - Update query/sort tests to match the new parsing behavior and simplified query object shapes.

snejus added 4 commits May 21, 2026 22:20

Move prefixes to where they are used

fe44647

Sort: rename order_clause -> clause

a00d209

dbcore: rename Database._fetch to Database.get_results

5250ae9

This removes ambiguity with Library._fetch

snejus requested a review from a team as a code owner May 21, 2026 21:30

Copilot AI review requested due to automatic review settings May 21, 2026 21:30

Copilot started reviewing on behalf of snejus May 21, 2026 21:30 View session

Copilot AI reviewed May 21, 2026

View reviewed changes

Comment thread beets/dbcore/queryparse.py

Comment thread beets/dbcore/queryparse.py Outdated

Comment thread beets/dbcore/queryparse.py

Comment thread beets/library/queries.py Outdated

Comment thread beets/library/queries.py Outdated

Comment thread beets/dbcore/queryparse.py

snejus force-pushed the db/refactor-queryparse branch 4 times, most recently from 2d4b40e to 01e2835 Compare May 21, 2026 21:58

snejus added 11 commits May 21, 2026 23:08

Use model default_sort for library query defaults

f2573d7

- Replace default sort helpers with LibModel.default_sort as a shared, config-driven sort helper for Item and Album. - Route Library._fetch through get_results with parsed query sort, explicit sort, or model default_sort fallback.

Define QueryByField

8507c98

Move tests to test_queryparse.py

22d0d14

Define shared fixtures and fix test_queryparse.py

af3947a

Refactor queryparse tests

d9869c5

snejus force-pushed the db/refactor-queryparse branch from 01e2835 to d9869c5 Compare May 21, 2026 22:09

snejus added 2 commits May 21, 2026 23:17

Fix tests on Win

7bb2cdb

Fix tests on Win 2

eb9647f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor queryparse#6663

Refactor queryparse#6663
snejus wants to merge 17 commits into
masterfrom
db/refactor-queryparse

snejus commented May 21, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 21, 2026

Uh oh!

codecov Bot commented May 21, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

snejus commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Query Parsing Architecture Consolidation

What this PR does

Why this matters

Architecture changes

1. Single parsing entrypoint: LibModel.parse_query(...)

Result

2. Parsing pipeline is now explicit

3. Query construction moved closer to query classes

Result

4. Query composition is simpler and flatter

High-level impact

5. Sort handling is centralized and model-driven

Result

Library and plugin integration impact

Before

After

Why this is good

Data flow after this PR

Behavioral changes reviewers should notice

1. Simpler query object shapes

2. Parsing happens once

3. Implicit path queries are normalized in parser/core flow

4. Sort suffix precedence is clearer

5. Public helper surface is reduced

Test architecture changes

What changed

Why this matters

Reviewer guide

Suggested review order

Net effect

Uh oh!

github-actions Bot commented May 21, 2026

Uh oh!

codecov Bot commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Unsupported file format

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

snejus commented May 21, 2026 •

edited

Loading

1. Single parsing entrypoint: `LibModel.parse_query(...)`

codecov Bot commented May 21, 2026 •

edited

Loading