Skip to content

Refactor queryparse#6663

Open
snejus wants to merge 17 commits into
masterfrom
db/refactor-queryparse
Open

Refactor queryparse#6663
snejus wants to merge 17 commits into
masterfrom
db/refactor-queryparse

Conversation

@snejus
Copy link
Copy Markdown
Member

@snejus snejus commented May 21, 2026

Query Parsing Architecture Consolidation

What this PR does

This PR reshapes query parsing around a single model-level entrypoint and moves parsing/building responsibilities closer to the objects that own them.

At a high level, the change:

  1. Replaces scattered query/sort helper functions with LibModel.parse_query(...).
  2. Centralizes parsing into ModelQuery, QueryTerm, and SortTerm.
  3. Moves field-specific query construction into query classes via FieldQuery.from_model(...).
  4. Updates library and plugin call sites to consume the new API consistently.
  5. Cleans up and modernizes the test suite around the new parsing model.

Why this matters

Before:

  • Query parsing logic was split across multiple helper functions.
  • Callers had to know which parsing helper to use.
  • Field normalization and query construction were partly model-owned, partly parser-owned.
  • Tests were spread across unrelated files and still used older unittest patterns.

After:

  • Parsing has one clear entrypoint: Item.parse_query(...) / Album.parse_query(...).
  • Query terms, sort terms, and final model query assembly each have distinct responsibilities.
  • Query classes now own more of their own normalization and construction behavior.
  • The parser tokenizes input once and builds both query and sort from the same token stream.
  • Tests better reflect the actual parsing contract.

Architecture changes

1. Single parsing entrypoint: LibModel.parse_query(...)

The biggest architectural change is that model classes now own parsing through LibModel.parse_query(...), which returns a ModelQuery object:

parsed = Item.parse_query('artist:foo year+')
query = parsed.query
sort = parsed.sort

This replaces older helper-style flows such as:

  • parse_query_string(...)
  • parse_query_parts(...)
  • query_from_strings(...)
  • sort_from_strings(...)
  • parse_sorted_query(...)

Result

Reviewers can now think about query parsing as:

input -> model.parse_query(...) -> ModelQuery(query, sort)

instead of multiple helper paths.


2. Parsing pipeline is now explicit

The new parsing flow is layered:

raw string / token list
    ->
`ModelQuery.parse(...)`
    ->
tokenization via `shlex`
    ->
`QueryTerm` parses search terms
`SortTerm` parses sort directives
    ->
`ModelQuery.get_query(...)`
`ModelQuery.get_sort(...)`
    ->
final executable `Query` + `Sort`

This makes responsibilities much clearer:

  • QueryTerm: parse one search term like 'year:1999' or '/tmp'
  • SortTerm: parse one sort directive like 'artist+'
  • ModelQuery: orchestrate the full query string

3. Query construction moved closer to query classes

A second major design improvement is that field-aware query creation now lives in query classes themselves via FieldQuery.from_model(...).

That method now handles concerns that used to be distributed elsewhere:

  • legacy field alias replacement
  • shared-table field qualification
  • path normalization
  • query-specific pattern normalization via normalize_pattern(...)

Result

Instead of models and parsers coordinating low-level query setup, the query class now owns how it should be built for a model.

That is a cleaner boundary:

parser decides "what query type"
query class decides "how to build itself correctly"

4. Query composition is simpler and flatter

Query.__or__ was added, and both AndQuery and OrQuery now flatten when combined.

This matters because the parser now builds final query trees through operator composition rather than repeatedly wrapping collection queries.

High-level impact

The resulting query objects are simpler:

  • fewer unnecessary one-element wrappers
  • more direct TrueQuery, NumericQuery, etc.
  • flatter AndQuery / OrQuery trees

This is why several tests changed from asserting on nested subqueries[0] shapes to asserting directly on the final query type.


5. Sort handling is centralized and model-driven

Sort parsing now follows the same pattern as query parsing:

  • SortTerm parses and validates sort tokens
  • sort resolution happens against model metadata
  • default sort resolution lives on LibModel.default_sort

Also, sort case sensitivity is now owned by FieldSort rather than threaded through parsing helpers.

Result

Sorting behavior is:

  • easier to reason about
  • less coupled to parser plumbing
  • more naturally configured at the sort layer

Library and plugin integration impact

This PR updates multiple call sites to use the new model API consistently.

Before

query, sort = parse_query_string(q, Item)

or

query, _ = Item.parse_query(q)

After

parsed = Item.parse_query(q)
query = parsed.query
sort = parsed.sort

This affects core library flow and plugin consumers such as:

  • advancedrewrite
  • convert
  • ihate
  • lyrics
  • smartplaylist
  • aura
  • bpd

Why this is good

All callers now depend on the same contract:

  • models parse queries
  • callers consume .query and .sort

That reduces API surface and keeps parsing behavior consistent across the application.


Data flow after this PR

User/config/plugin query string
    ->
`Item.parse_query(...)` / `Album.parse_query(...)`
    ->
`ModelQuery.parse(...)`
    ->
build `query` and `sort`
    ->
`Library.get_results(...)` / plugin matching logic

And inside parsing:

'foo bar, baz year+'
    ->
search terms: `foo`, `bar`, `baz`
sort terms: `year+`
    ->
(`foo` AND `bar`) OR `baz`
+
sort by `year`

Behavioral changes reviewers should notice

This is mostly a refactor, but there are a few meaningful parser-contract changes:

1. Simpler query object shapes

Empty and single-term parses now often return direct queries like TrueQuery or NumericQuery instead of wrapper collection queries.

2. Parsing happens once

ModelQuery.parse(...) tokenizes once and derives both query and sort from the same token stream.

3. Implicit path queries are normalized in parser/core flow

Path-like input such as '/tmp' becomes a path query earlier and more consistently.

4. Sort suffix precedence is clearer

Strings ending in '+' or '-' are treated as sort directives when valid.

5. Public helper surface is reduced

Legacy dbcore parsing re-exports/helpers are removed in favor of the model API.


Test architecture changes

The test portion of this PR is not just cleanup; it aligns the suite with the new architecture.

What changed

  • Query parsing tests were moved into dedicated test/dbcore/test_queryparse.py
  • shared test models/sorts/queries were extracted into beets.test.fixtures
  • older unittest.TestCase style tests were rewritten into grouped pytest classes
  • many repetitive cases were collapsed into @pytest.mark.parametrize(...)

Why this matters

The tests now mirror the actual parser design:

  • TestQueryTermParsing
  • TestQueryFromParts
  • TestSortFromParts
  • TestParseSortedQuery
  • TestModelQueryErrors

That makes the parser contract easier to review and safer to extend.


Reviewer guide

If reviewing for architecture, the most important files are:

  • beets/dbcore/queryparse.py
  • beets/dbcore/query.py
  • beets/library/models.py
  • beets/library/library.py
  • beets/test/fixtures.py
  • test/dbcore/test_queryparse.py

Suggested review order

  1. beets/dbcore/queryparse.py
    • new parsing model
  2. beets/dbcore/query.py
    • query construction and composition changes
  3. beets/library/models.py
    • new model entrypoint and default sort ownership
  4. plugin call sites
    • API adoption
  5. tests
    • expected behavior and edge cases

Net effect

This PR is a query parsing consolidation.

It moves the system from a helper-driven design to a model-centered parsing API with clearer ownership boundaries:

  • models own parsing entrypoints
  • parser objects own syntax handling
  • query classes own model-aware construction
  • sort classes own sort behavior
  • callers consume one consistent parsed result

The overall impact is a smaller public surface, clearer layering, and a test suite that better documents the intended behavior.

snejus added 4 commits May 21, 2026 22:20
This removes ambiguity with Library._fetch
Move sort case-sensitivity config lookup into `FieldSort` as a cached
class property instead of threading a `case_insensitive` flag through
query parsing helpers.

This centralizes sort behavior at the sort implementation boundary and
simplifies parse/orchestration call paths while preserving configurable
case-sensitive ordering.
@snejus snejus requested a review from a team as a code owner May 21, 2026 21:30
Copilot AI review requested due to automatic review settings May 21, 2026 21:30
@github-actions
Copy link
Copy Markdown

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 21, 2026

⚠️ Unsupported file format

Upload processing failed due to unsupported file format. Please review the parser error message:
Error parsing JUnit XML in /home/runner/work/beets/beets/.reports/pytest.xml at 1:88641

Caused by:
RuntimeError: Error converting computed name to ValidatedString

Caused by:
    string is too long</code></pre>

For more help, visit our troubleshooting guide.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

PR move query parsing from many helper path into one model entrypoint (LibModel.parse_query), then build query+sort via ModelQuery / QueryTerm / SortTerm. PR also update core + plugins + tests to use new API.

Changes:

  • Replace scattered parse_query_* + dbcore.parse_sorted_query usage with Item/Album.parse_query(...) returning ModelQuery(query, sort).
  • Move field-aware query construction into FieldQuery.from_model(...) and flatten query composition via Query.__or__ + AndQuery/OrQuery flattening.
  • Restructure tests: new test/dbcore/test_queryparse.py and shared model fixtures in beets/test/fixtures.py.

grug note: big change, many place for bug hide. grug want fix few correctness holes before merge.

Reviewed changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
test/test_library.py Remove legacy parse-query unittest cases (moved/updated elsewhere).
test/test_dbcore.py Update fixtures usage + switch tests from _fetch to get_results; remove old queryparse tests.
test/plugins/test_smartplaylist.py Update smartplaylist tests to use Item/Album.parse_query and new sort expectations.
test/dbcore/test_sort.py Update sort tests for new parsing + config-driven case sensitivity; clear cached classproperty cache.
test/dbcore/test_queryparse.py New pytest coverage for QueryTerm, ModelQuery, and sort parsing behavior.
beetsplug/smartplaylist.py Switch playlist query parsing to model_cls.parse_query; adjust typing.
beetsplug/playlist.py Update typing alias for item query registration (QueryByField).
beetsplug/play.py Use Album.default_sort instead of library helper.
beetsplug/lyrics.py Use Item.parse_query for auto-ignore query parsing.
beetsplug/ihate.py Use model_cls.parse_query(...).query for match checks.
beetsplug/convert.py Use Item.parse_query(...).query for no_convert matching.
beetsplug/bpd/init.py Switch to FieldQuery.from_model / any_writable_media_field_query new arg order.
beetsplug/aura.py Switch field queries to FieldQuery.from_model.
beetsplug/advancedrewrite.py Replace manual shlex + helper parsing with Item.parse_query.
beets/test/fixtures.py New shared fixtures for dbcore query/sort/model tests.
beets/plugins.py Update query-prefix typing and comprehension-based merge of plugin query prefixes.
beets/library/queries.py Deprecate old query parsing helpers and forward to model parsing.
beets/library/models.py Add LibModel.parse_query + default_sort; migrate helper query construction to FieldQuery.from_model.
beets/library/library.py Make library fetching accept string/parts/query; route to model_cls.parse_query and Database.get_results.
beets/dbcore/types.py Tighten Type.query typing to type[FieldQuery].
beets/dbcore/sort.py Rename order_clause -> clause; move case-sensitivity ownership into FieldSort.
beets/dbcore/queryparse.py New parsing pipeline (ModelQuery, QueryTerm, SortTerm) built on shlex token stream.
beets/dbcore/query.py Add FieldQuery.from_model + pattern normalization hooks; flatten AndQuery/OrQuery composition.
beets/dbcore/db.py Rename internal fetch API to get_results; switch to Sort.clause().
beets/dbcore/init.py Stop re-exporting old queryparse helper functions.

Comment thread beets/dbcore/queryparse.py
Comment thread beets/dbcore/queryparse.py Outdated
Comment thread beets/dbcore/queryparse.py
Comment thread beets/library/queries.py Outdated
Comment thread beets/library/queries.py Outdated
Comment thread beets/dbcore/queryparse.py
@snejus snejus force-pushed the db/refactor-queryparse branch 4 times, most recently from 2d4b40e to 01e2835 Compare May 21, 2026 21:58
snejus added 11 commits May 21, 2026 23:08
- Introduce SortTerm to centralize sort-part validation and direction parsing.
- Move model-specific sort resolution into SortTerm.get_sort, including smart
artist field mapping.
- Simplify sort_from_strings to build sorts from parsed terms.
- Replace parse_query_part with a QueryTerm parser that captures negate, field,
  prefix, and pattern in one pass.
- Resolve query class selection through QueryTerm so prefix and field-based
  dispatch are handled consistently.
- Move implicit path query normalization into dbcore parsing, simplifying
  library-level query parsing.
- Tighten plugin query typing and update dbcore tests for path behavior and
  related fixture assertions.
- Replace default sort helpers with LibModel.default_sort as a shared,
  config-driven sort helper for Item and Album.
- Route Library._fetch through get_results with parsed query sort, explicit sort,
  or model default_sort fallback.
- Add ModelQuery in dbcore.queryparse and expose LibModel.parse_query as
  the single parsing entrypoint for string and sequence query inputs.
- Route library fetch, path format matching, and plugin query parsing
  through the model-level API, and deprecate
  beets.library.parse_query_string and parse_query_parts.
- Update tests to cover invalid query parsing via ModelQuery and align
  smartplaylist sort assertions with parsed sort behavior.
- Move field alias handling, shared-table qualification, and path pattern
  normalization into FieldQuery.from_model with a query-specific
  normalize_pattern hook.
- Update query parsing and plugin call sites to construct queries through
  query_cls.from_model and remove redundant model-level field query helpers.
  This keeps model entrypoints focused on orchestration and colocates
  transformation behavior with the query types that own it.
- Move per-term query building into QueryTerm.get_query and compose
  field queries with operator-based OR logic.
- Add Query.__or__ plus flattening behavior for AndQuery and OrQuery to
  avoid nested collection queries.
- Update dbcore parsing expectations so single-field terms assert
  SubstringQuery directly.
- Move query and sort construction into `ModelQuery` classmethods and parse
  tokenized input once.
- Switch call sites to consume `parse_query(...).query` and
  `parse_query(...).sort` instead of tuple unpacking.
- Remove legacy `dbcore` re-exports and obsolete helper functions in
  `queryparse`.
- Update query/sort tests to match the new parsing behavior and simplified
  query object shapes.
@snejus snejus force-pushed the db/refactor-queryparse branch from 01e2835 to d9869c5 Compare May 21, 2026 22:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants