Skip to content

Improve SearchApiMetadataSourcePlugin interface#6637

Open
amogus07 wants to merge 3 commits into
masterfrom
get-id-abstractmethod
Open

Improve SearchApiMetadataSourcePlugin interface#6637
amogus07 wants to merge 3 commits into
masterfrom
get-id-abstractmethod

Conversation

@amogus07
Copy link
Copy Markdown
Contributor

@amogus07 amogus07 commented May 14, 2026

Description

I'd like to refactor beets-vocadb to be a SearchApiMetadataSourcePlugin, but it serializes api responses to dataclass-like msgspec structs rather than TypedDicts, making it impossible to follow the class signature. Therefore, I removed the constraint from the type variable to allow plugin developers to use any type of data structure for search search results and instead have them implement custom logic to extract the id from a search result. For example, in beets-vocadb, I'd return result.id instead of result["id"].

To Do

  • Documentation. (If you've added a new command-line flag, for example, find the appropriate page under docs/ to describe it.)
  • Changelog. (Add an entry to docs/changelog.rst to the bottom of one of the lists near the top of the document.)
  • Tests. (Very much encouraged but not strictly required.)

@amogus07 amogus07 requested review from a team, semohr and snejus as code owners May 14, 2026 11:17
@github-actions
Copy link
Copy Markdown

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 14, 2026

Codecov Report

❌ Patch coverage is 79.31034% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.44%. Comparing base (fc9c02a) to head (2e0475b).

Files with missing lines Patch % Lines
beetsplug/deezer.py 33.33% 2 Missing ⚠️
beetsplug/discogs/__init__.py 33.33% 2 Missing ⚠️
beetsplug/spotify.py 66.66% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6637   +/-   ##
=======================================
  Coverage   72.44%   72.44%           
=======================================
  Files         160      160           
  Lines       20690    20700   +10     
  Branches     3272     3272           
=======================================
+ Hits        14989    14997    +8     
- Misses       4976     4978    +2     
  Partials      725      725           
Files with missing lines Coverage Δ
beets/metadata_plugins.py 85.29% <100.00%> (+0.53%) ⬆️
beetsplug/musicbrainz.py 94.98% <100.00%> (+0.01%) ⬆️
beetsplug/deezer.py 18.30% <33.33%> (-0.13%) ⬇️
beetsplug/discogs/__init__.py 67.17% <33.33%> (-0.21%) ⬇️
beetsplug/spotify.py 62.78% <66.66%> (+0.09%) ⬆️
🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@amogus07 amogus07 changed the title Add abstract method for ID extraction in metadata plugins Add abstract method for ID extraction from search response in SearchApiMetadataSourcePlugin May 14, 2026
@snejus
Copy link
Copy Markdown
Member

snejus commented May 15, 2026

Would it be possible for you to extend your classes to support __getitem__ method to allow indexing instead?

See beets.autotag.hooks::AttrDict where we do the opposite.

@amogus07
Copy link
Copy Markdown
Contributor Author

amogus07 commented May 15, 2026

Would it be possible for you to extend your classes to support __getitem__ method to allow indexing instead?

See beets.autotag.hooks::AttrDict where we do the opposite.

I considered it but it wouldn't work because 1. the TypeVar in SearchApiMetadataSourcePlugin is bound to the IDResponse TypedDict (which I can't inherit from) and 2. I would have to use Any to annotate it which I currently don't use anywhere in the codebase and would like to avoid using.

@amogus07
Copy link
Copy Markdown
Contributor Author

I also considered making IDResponse a protocal, but I couldn't get that to work either.

@amogus07 amogus07 force-pushed the get-id-abstractmethod branch from 06b1b81 to 33447c7 Compare May 15, 2026 19:55
@snejus
Copy link
Copy Markdown
Member

snejus commented May 15, 2026

What's the reason for using msgpack.Struct in your case? IDResponse is here for a reason - we expect the results to be returned as dict types. I don't think that we are ready to relax this requirement in the core.

@amogus07
Copy link
Copy Markdown
Contributor Author

amogus07 commented May 16, 2026

Update: I could get it working now with the latest stable, except for one patch: importing TypedDict from typing_extensions instead of typing to allow subclasses to use closed=True. It's not very polished as I've mostly copy-pasted from my existing candidates methods, but it does work. Still, please leave this pr open for now as at least that small patch would have to be merged into master, and possibly the interface could be improved in a different way too.

class SongIDResponse(IDResponse, total=True, closed=True):
    data: SongForApiContract

class PluginBase(SearchApiMetadataSourcePlugin[IDResponse | SongIDResponse]):
	...
    @override
    def get_search_query_with_filters(
        self,
        query_type: QueryType,
        items: Sequence[library.Item],
        artist: str,
        name: str,
        va_likely: bool,
    ) -> tuple[str, dict[str, str]]:
        match query_type:
            case "album":
                criteria: dict[str, str] = (
                    {"barcode": barcode}
                    if (item := next(iter(items)))
                    and (barcode := item.get("barcode"))
                    else {}
                )
            case "track":
                item: library.Item
                barcode: str | None
                criteria = {}

        return name, {
            k: _v for k, v in criteria.items() if (_v := v.lower().strip())
        }

    @override
    def get_search_response(
        self, params: SearchParams
    ) -> Sequence[IDResponse | SongIDResponse]:
        match params.query_type:
            case "album":
                self._log.debug(msg=f"Searching for album {params.query}")
                remote_album_find_result: (
                    AlbumForApiContractPartialFindResult | None
                ) = self.album_api.api_albums_get(
                    query=params.query,
                    barcode=params.filters.get("barcode", ""),
                    maxResults=params.limit,
                    nameMatchMode=NameMatchMode.AUTO,
                )
                remote_album_candidates: tuple[AlbumForApiContract, ...] | None
                if not remote_album_find_result or not (
                    remote_album_candidates := remote_album_find_result.items
                ):
                    return []
                else:
                    self._log.debug(
                        msg=f"Found {len(remote_album_candidates)} result(s)"
                        + f" for '{params.query}'"
                    )
                # songFields parameter doesn't exist for album search
                # so we'll get albums by their id
                return [
                    {"id": album_id}
                    for remote_album_candidate in remote_album_candidates
                    if (
                        album_id := self._extract_id(
                            url=remote_album_candidate.id
                        )
                    )
                ]
            case "track":
                self._log.debug(msg=f"Searching for track {params.query}")
                remote_item_find_result: (
                    SongForApiContractPartialFindResult | None
                ) = self.song_api.api_songs_get(
                    query=params.query,
                    fields=SONG_FIELDS,
                    maxResults=params.limit,
                    nameMatchMode=NameMatchMode.AUTO,
                    preferAccurateMatches=True,
                    sort=SongSortRule.SONG_TYPE,
                    lang=self.language_preference,
                )
                remote_item_candidates: tuple[SongForApiContract, ...] | None
                if not remote_item_find_result or not (
                    remote_item_candidates := remote_item_find_result.items
                ):
                    self._log.debug(msg=f"Found 0 results for '{params.query}'")
                    return []
                self._log.debug(
                    msg=f"Found {len(remote_item_candidates)} result(s)"
                    + f" for '{params.query}'"
                )
                return [
                    {
                        "id": str(remote_item_candidate.id),
                        "data": remote_item_candidate,
                    }
                    for remote_item_candidate in remote_item_candidates
                ]

    @override
    def item_candidates(
        self, item: library.Item, artist: str, title: str
    ) -> Iterable[TrackInfo]:
        return filter(
            None,
            map(
                self.mapper.track_info,
                (
                    data
                    if (data := remote_song_candidate.get("data"))
                    and isinstance(data, SongForApiContract)
                    else None
                    for remote_song_candidate in self._get_candidates(
                        query_type="track",
                        items=[item],
                        artist=artist,
                        name=title,
                        va_likely=False,
                    )
                ),
            ),
        )

@amogus07 amogus07 changed the title Add abstract method for ID extraction from search response in SearchApiMetadataSourcePlugin Improve SearchApiMetadataSourcePlugin interface May 16, 2026
@amogus07 amogus07 force-pushed the get-id-abstractmethod branch 5 times, most recently from 7dd1953 to 17ff563 Compare May 16, 2026 15:06
amogus07 added 3 commits May 17, 2026 18:15
this allows subclasses of IDResponse to use closed=True
overriding them allows subclasses to customize the behavior
this allows subclassses to return an iterator for results and use
internal logic to determine the number of results without prematurely
exhausting it that case
@amogus07 amogus07 force-pushed the get-id-abstractmethod branch from 17ff563 to 2e0475b Compare May 17, 2026 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants