feat(duplicates): add chroma plugin integration #6624
Open
ShimmerGlass wants to merge 1 commit into
8646dc0 to 5992bd1
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6624   +/-   ##
=======================================
  Coverage   72.38%   72.38%
=======================================
  Files         159      159
  Lines       20645    20689   +44
  Branches     3269     3280   +11
=======================================
+ Hits        14944    14976   +32
- Misses       4976     4982    +6
- Partials      725      731    +6
With chroma enabled, add an option to compare track fingerprints when duplicates are found, so that tracks with different audio are removed from the results.
Contributor
Pull request overview
grug see PR try let duplicates use chroma sonic fingerprint to narrow duplicate results (so grug can avoid “same mbid but different recording” trap).
Changes:
- add `--chroma` + `--chroma-threshold` options to duplicates command (only when chroma plugin loaded)
- add `AcoustidPlugin.compare_items()` helper in chroma plugin
- add small plugin helper `find_plugin()` and update docs/tests
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| `beetsplug/duplicates.py` | add chroma option wiring + filtering step in duplicate grouping |
| `beetsplug/chroma.py` | add `compare_items()` API to compare two library Items via fingerprints |
| `beets/plugins.py` | add `find_plugin(type)` helper to fetch loaded plugin instance |
| `docs/plugins/duplicates.rst` | document chroma integration options and behavior |
| `docs/changelog.rst` | add changelog entry for new feature |
| `test/plugins/test_duplicates.py` | add tests for chroma integration behavior |
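The `find_plugin(type)` helper named in the table is not shown in this excerpt. A minimal sketch of what such a lookup could look like, assuming the loaded plugins are available as an iterable of instances. The `loaded` parameter and both stand-in classes are hypothetical and are not taken from the PR:

```python
from typing import Iterable, Optional, Type, TypeVar

P = TypeVar("P")


class FakeChromaPlugin:
    """Hypothetical stand-in for the chroma plugin class."""


class FakeOtherPlugin:
    """Hypothetical stand-in for any other loaded plugin."""


def find_plugin(plugin_type: Type[P], loaded: Iterable[object]) -> Optional[P]:
    """Return the first loaded plugin that is an instance of plugin_type.

    Returns None when no such plugin is loaded, so callers must handle
    the "plugin not enabled" case explicitly.
    """
    for plugin in loaded:
        if isinstance(plugin, plugin_type):
            return plugin
    return None
```

Returning `None` rather than raising keeps the helper generic, but it moves the responsibility for a clear error message to the caller.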
Comments suppressed due to low confidence (3)
beetsplug/duplicates.py:88
- grug think `chroma = self.config['chroma']` can be true via config even when chroma plugin not loaded (then CLI flag not exist). later `_chroma_filter_dups` call `_chroma_plug()` and get None, then `None.compare_items` boom. fix: when chroma option true, check `_chroma_plug()` not None early and raise UserError with clear message (enable chroma plugin / install pyacoustid).
def commands(self):
    def _dup(lib, opts, args):
        self.config.set_args(opts)
        album = self.config["album"].get(bool)
        checksum = self.config["checksum"].get(str)
        chroma = self.config["chroma"].get(bool)
        chroma_threshold = self.config["chroma_threshold"].get(float)
        copy = bytestring_path(self.config["copy"].as_str())
        count = self.config["count"].get(bool)
        delete = self.config["delete"].get(bool)
        remove = self.config["remove"].get(bool)
        fmt_tmpl = self.config["format"].get(str)
        full = self.config["full"].get(bool)
        keys = self.config["keys"].as_str_seq()
        merge = self.config["merge"].get(bool)
        move = bytestring_path(self.config["move"].as_str())
        path = self.config["path"].get(bool)
        tiebreak = self.config["tiebreak"].get(dict)
        strict = self.config["strict"].get(bool)
        tag = self.config["tag"].get(str)
        if album and chroma:
            raise ui.UserError("cannot use chroma for albums")
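The early check suggested in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: `UserError` here is a local stand-in for `beets.ui.UserError`, and `require_chroma` is a hypothetical helper name that does not appear in the PR:

```python
class UserError(Exception):
    """Local stand-in for beets.ui.UserError (assumption for this sketch)."""


def require_chroma(chroma_enabled, chroma_plugin):
    """Validate the chroma option before any duplicate processing starts.

    chroma_enabled: value of the `chroma` option (config or CLI).
    chroma_plugin:  loaded chroma plugin instance, or None if not loaded.
    Returns the plugin to use, or None when the option is off.
    """
    if not chroma_enabled:
        return None
    if chroma_plugin is None:
        # Fail with a clear message instead of an AttributeError later on.
        raise UserError(
            "the chroma option requires the chroma plugin to be enabled "
            "(and pyacoustid to be installed)"
        )
    return chroma_plugin
```

Running the check once at the top of the command means a config-only `chroma: yes` without the plugin is reported before any items are grouped.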
beetsplug/duplicates.py:462
- grug look at `_chroma_filter_dups`: it keeps item `b` when score is None or `< chroma_thresh`, and drops when score is high. docs/changelog say chroma should exclude tracks with different audio, so grug expect low score should be excluded, high score kept. also `pairwise(items)` compare adjacent original list, so when middle item dropped, next compare still against dropped one (A-B-B-C problem) and results wrong for 3+ items. fix: decide clear rule (keep only items similar to chosen reference / build clusters) and make tests+docs match.
def _duplicates(
    self, objs, keys, full, strict, tiebreak, merge, chroma, chroma_thresh
):
    """Generate triples of keys, duplicate counts, and constituent objects."""
    offset = 0 if full else 1
    for k, objs in self._group_by(objs, keys, strict).items():
        if len(objs) <= 1:
            continue
        objs = self._order(objs, tiebreak)
        if chroma:
            objs = self._chroma_filter_dups(objs, chroma_thresh)
            if len(objs) <= 1:
                continue
        if merge:
            objs = self._merge(objs)
        yield (k, len(objs) - offset, objs[offset:])

def _chroma_filter_dups(self, items, chroma_thresh):
    choma_plug = self._chroma_plug()
    res = [items[0]]
    for a, b in itertools.pairwise(items):
        score = choma_plug.compare_items(a, b)
        if score is None or score < chroma_thresh:
            res.append(b)
    return res
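One of the two rules proposed in the comment above, "keep only items similar to a chosen reference", could look roughly like the sketch below. It is not the PR's code: `filter_similar_to_reference` and the injected `compare` callable are hypothetical, and dropping items whose score is `None` (no fingerprint available) is an assumption the PR would have to decide on:

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


def filter_similar_to_reference(
    items: Sequence[T],
    compare: Callable[[T, T], Optional[float]],
    threshold: float,
) -> List[T]:
    """Keep the first item and every later item that sounds like it.

    Every candidate is compared against the same reference (items[0]),
    never against a neighbour that may already have been dropped.
    """
    if not items:
        return []
    reference = items[0]
    kept = [reference]
    for candidate in items[1:]:
        score = compare(reference, candidate)
        # Assumption: an unknown score (None) counts as "not a duplicate".
        if score is not None and score >= threshold:
            kept.append(candidate)
    return kept


# Three tracks sharing the same keys: B has different audio, C matches A.
scores = {("A", "B"): 0.5, ("A", "C"): 0.95}
kept = filter_similar_to_reference(
    ["A", "B", "C"], lambda a, b: scores.get((a, b)), 0.9
)
# kept == ["A", "C"]
```

Because the reference is the first item after `_order()`, the result depends on the tiebreak ordering; building clusters instead would avoid that dependence at the cost of more comparisons.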
test/plugins/test_duplicates.py:107
- grug see chroma tests expect high similarity (0.99) => no duplicates, and low similarity (0.5) => duplicates shown. this match current `_chroma_filter_dups` logic, but it contradict doc/changelog text that different-audio tracks should be excluded when chroma enabled. once filter fixed, these assertions likely need flip (similar => duplicates output, dissimilar => empty) or update docs if intent opposite.
@patch("acoustid.compare_fingerprints", return_value=0.99)
def test_duplicate_chroma_similar(self, cf):
    self.create_dups(2)
    out = self.run_cmd(chroma=True)
    assert out == ""

@patch("acoustid.compare_fingerprints", return_value=0.5)
def test_duplicate_chroma_dissimilar(self, cf):
    self.create_dups(2)
    out = self.run_cmd(chroma=True)
    assert self.dup_item.artist in out
    assert self.dup_item.album in out
    assert self.dup_item.title in out
Comment on lines 17 to 33
@@ -27,6 +29,7 @@
    displayable_path,
    subprocess,
)
from beetsplug import chroma
Comment on lines +95 to +97
sonic fingerprinting capabilities to compare the tracks audio in addition of
their ``keys``. This is especially useful when multiple versions of a song, such
as live or remixes exist in the library.
Description
When the chroma plugin is enabled, the duplicates plugin can now make use of
its sonic fingerprinting capabilities to compare the tracks' audio in addition to
their `keys`. This is especially useful when multiple versions of a song, such
as live versions or remixes, exist in the library.
When the option is enabled, tracks with the same `keys` but different audio
will be excluded from the results.
The following additional options are available when the chroma plugin is
enabled:
same. 1 means an exact match; 0 nothing alike. Default: `0.9`.
To Do
`docs/` to describe it.)
`docs/changelog.rst` to the bottom of one of the lists near the top of the document.)