Conversation
- Test file format (valid and error cases) - Test regex pattern matching (_is_ignored) - Test _resolve_genres: ignored genres filtered - Test _resolve_genres: c14n ancestry walk blocked for ignored tags - Test _resolve_genres: without whitelist, oldest ancestor is kept - Test _resolve_genres: whithout whitelist, ignored oldest ancestor is removed
|
Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry. |
There was a problem hiding this comment.
Pull request overview
grug see PR add genre ignorelist to lastgenre plugin, so wrong Last.fm tags can be blocked global (*) or per artist. this fit plugin job: pick good genre tags and keep user control.
Changes:
- add ignorelist file parsing + regex/literal compile, plus shared
is_ignored()helper - apply ignorelist both right after Last.fm fetch and again during genre resolve/whitelist/c14n
- add docs + tests for ignorelist matching and file format/errors
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
beetsplug/lastgenre/__init__.py |
load ignorelist from config file; thread artist through resolve path; filter ignored tags during c14n + final validation |
beetsplug/lastgenre/client.py |
filter ignored tags immediately after Last.fm lookup (while keeping cache) |
beetsplug/lastgenre/utils.py |
new shared is_ignored() helper + ignorelist typing |
docs/plugins/lastgenre.rst |
document ignorelist feature + config option |
test/plugins/test_lastgenre.py |
add ignorelist tests (resolve behavior, matcher behavior, file format + errors) |
Comments suppressed due to low confidence (1)
test/plugins/test_lastgenre.py:293
- grug see this
configfixture just return upstream module-scoped config. it not reset state per test, soconfig['lastgenre']['ignorelist'](and other keys) can leak between tests and even breakLastGenrePlugin()init when prior test set ignorelist path then delete file. grug suggest make this fixture actually reset needed keys (ex: setlastgenre.ignorelist = Falsehere) or use function-scoped fresh config object.
@pytest.fixture
def config(config):
"""Provide a fresh beets configuration for every test/parameterize call
This is necessary to prevent the following parameterized test to bleed
config test state in between test cases.
"""
return config
| elif obj.albumartist != config["va_name"].as_str(): | ||
| new_genres = self.client.fetch_artist_genre(obj.albumartist) | ||
| stage_label = "album artist" | ||
| stage_artist = obj.albumartist | ||
| if not new_genres: |
There was a problem hiding this comment.
grug see multi-valued album artist fallback collect genres from many obj.albumartists, but stage_artist stay set to obj.albumartist. then later resolve step apply artist-specific ignorelist for wrong artist and can drop genre that came from other albumartist. grug think when stage_label become "multi-valued album artist", set stage_artist = None (global only) or handle per-artist filtering another way.
There was a problem hiding this comment.
addressed and added a test for it
docs/plugins/lastgenre.rst
Outdated
| fracture: | ||
| ^(heavy|black|power|death)?\s?(metal|rock)$|\w+-metal\d*$ | ||
| progressive metal |
There was a problem hiding this comment.
grug think regex example in code block wrong: it show double backslash (\\s, \\w, \\d). in ignorelist file user must write single backslash (\s, \w, \d) or regex will match literal "\s" text, not whitespace/word/digit. please change example to single backslashes so copy/paste work.
- Prevents wrong last.fm genres based on a per artist (or global) list of regex
patterns that should be ignored.
- Genre _ignoring_ happens in two places: Right after fetching the
last.fm genre and in _resolve_genres.
- As a fallback literal string matching can be used instead of
supplying a regex pattern
- Includes docs for the new feature.
and fix fallback types
1400171 to
c17eebb
Compare
|
@snejus this is now ready for a first (human) review except the copilot comments I will address. Thanks in advance :-) |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #6449 +/- ##
==========================================
+ Coverage 69.73% 69.91% +0.18%
==========================================
Files 145 146 +1
Lines 18534 18575 +41
Branches 3023 3027 +4
==========================================
+ Hits 12924 12987 +63
+ Misses 4976 4957 -19
+ Partials 634 631 -3
🚀 New features to boost your workflow:
|
|
Just had a quick look at the description and wanted to suggest that ideally we want to stay within the beets configuration YAML for the regex patterns. See my configuration: #6269 (comment), specifically What specific issue YAML poses in your case? |
|
Cant recall specifically, it was a year ago already. Loads of loads of excaping this and that. regexes bacame tedious to write and even read. Will take a look at how you implemented regex patterns within yaml. What oposes not a separate format? whitelist and c14n is something else anyway. I'd say it's simpler to use. Main reason. I gave up back then trying to use ini or yaml for regexes. It was simply painful to use 🤷 this new format I will use as-is for genre aliases too btw. also a reason why I think it's the best way forward for the project. |
Personally it's about the convenience - I want all my configuration to live in a single file. And if it's getting unwieldy, I can move, say, For example, my directory: /mnt/music/Music
library: ~/.music/beets/library.db
include:
- replacements.yaml |
|
Also I don't see how a separate file improve the regex definitions. Note that artwork:
pattern: |
\b(?i:art(?:work)?|cover)\b
[^|:,.\n-]*?
(?:[-:]\ ?|by:?\ )
(
https:[^ ]+
| (
[\w@]
(?:[^|:,.\n-]|[.]\w)+
[^|:,.\n -]
)+
)
([|:,.\n \"-]|$)Then when you read such configuration in python, make sure to specify |
|
Please don't get me wrong: I understand that yaml is our config format and my first versions of this feature when I started about a year ago was of course trying to make it work with yaml. And yes I am certainly aware of yaml includes and I do use them too :-) Problems I had were mainly escaping issues, yaml treating regexes that work elsewhere in python do not work when put into yaml. I went to escaping hell and back. I also tried ini as an alternative and that was a little better but still painful. At some point I decided that it is not worth ot and won't ever be userfriendly, so I put thought and effort into finding the most simple solution from a user perspective. Open file - add regex from regex101 or self constructed - see if it compiles - and done. No special treatment so yaml accepts it. That is user friendly and overrules "having everything in one file", which you say for yourself you don't do anyway when stuff gets too big. For last.fm genre ignores, believe me you will have a separate file because it gets big. So your argument is not too strong here. Anyway I'm not at home/at computer these days and I won't argue further here, so if you insist that yaml is the better way to go, I will do it. It will take some time and it's a bit of a bummer that sets me back from finally getting this over with... If you help me with escaping my regexes for yaml I will be grateful I see weird things in your config, what is this for example? |
multiline regexes dont solve the escaping problems, also I don't really need them for my usecase. What are you trying to say here? |
|
And just so we know what we are actually talking about, this format is really straight forward and a pleasure to read and edit. Not even any quoting required ❤️ |
|
To further give you the full picture of my reasoning: I took quite some ideas for the aliases and iignores features from discontinued whatlastgwnre 3rd party plugin. It had curated lists of very useful patterns for the aliases features. They worked well for me back then when I used the plugin regularly, so I wanted to continue use them and also include them in my lastgenre implementation of these feature ideas. I still plan to ship a curated list of aliases with beets! Stuff that works for everyone. Shipping in a sep file is easy. I tried hard to make them work in yaml and failed. It was a never ending story escaping everything so yaml accepts it: I'm giving you all this so you might better understand that I did my due diligence, made my homework and had a long and thoughtful path until I decided that a custom format is the best and most of all userfriendly way. If you still insist that yaml is the best way I will start doing it. But please do note that I put a lot of thought, work and experimentation into my decisions and it's not something I just decided out of the blue. |
I want to understand this better. Can you give me an example where this is an issue? |
66a7d98 to
1785dab
Compare
|
Hi again @snejus, I refactored this to have everything right within the beets config. I added some details to docs and docstrings user's should be cautious about when putting there regex into yaml. I won't be able to recall excatly errors from when I first started with this feature. In the end it doesnt matter. I can only suppose that I might have tried to put regex into double quotes which is a very bad idea. Also I might have had a lot of errors with ini as well and all that added up to the conclusion that yaml and ini are bad -> Invent something that just works and most of all is more user friendly. I still find my first version the most user friendly and it didn't require any "caution don't do this, don't do that" stuff in the docs ;-) But I now learned more about it and it is not too much to worry about. In most cases with regex in yaml it is just best to not use any single or double quotes at all and most stuff will work. So @snejus the 3 things I'd appreciate if you took special attention to in another review are:
Currently existing genres check happens in resolve_genre, and it is applied possibly more than once. I'm just running through that again to see if there is a redundant check in there. |
1785dab to
c5a14d3
Compare
You can use them as long as you use yaml literal (or scalar) style blocks: configuration: |
you can use
any
characters
'"
you likeWill review this tomorrow! |
Description
Adds a global and artist-specific genre ignorelist to lastgenre. Ignorelist entries can use regex patterns or literal genre names and are configurable per artist or globally (*). Example:
FIXME config
FIXME description
Further notes
My personal roadmap for the
lastgenreplugin: #5915Feature idea originated from: #5721 (comment).
To Do