Conversation

@henry-oberholtzer
Member

@henry-oberholtzer henry-oberholtzer commented Nov 23, 2025

Description

Fixes #6177, #6068.

I fixed the issue in #6177 by removing the derived class interface, and moving those fields back into function variables. They're a bit unwieldy still, but that's the algorithm it came with. There's a lot of room to continue to improve the clarity of the code in that section, but I think that'll require a deeper overhaul.

For #6068, I created the ArtistInfo and AlbumArtistInfo typed dictionaries, and was able to centralize the logic of building the artist info into build_artistinfo and build_albumartistinfo. Tests for these scenarios were created largely by expanding existing tests to incorporate the new fields.
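As a rough illustration of the typed-dictionary approach described above, here is a minimal sketch. The field names are assumptions inferred from this discussion, not the actual definitions in beetsplug/discogs.py, and this `build_artistinfo` is a simplified stand-in that ignores ANV handling, join strings, and featured artists.

```python
from typing import TypedDict


class ArtistInfo(TypedDict):
    # Field names are assumed for illustration only.
    artist: str
    artists: list[str]
    artist_id: str
    artists_ids: list[str]


def build_artistinfo(raw_artists: list[dict]) -> ArtistInfo:
    """Collect the single-value and multi-value artist fields in one place."""
    names = [a["name"] for a in raw_artists]
    ids = [str(a["id"]) for a in raw_artists]
    return {
        "artist": ", ".join(names),
        "artists": names,
        "artist_id": ids[0] if ids else "",
        "artists_ids": ids,
    }


info = build_artistinfo([{"name": "A", "id": 1}, {"name": "B", "id": 2}])
```

Centralizing the construction like this means the multi-value fields and their single-value counterparts cannot drift apart between call sites.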

I think I'll have to rethink the entire algorithm to make it flexible enough to handle #6030, so I'll move that to a later PR in the interest of getting the flex attr fix in.

To Do

  • Changelog. (Add an entry to docs/changelog.rst to the bottom of one of the lists near the top of the document.)
  • Tests.

@henry-oberholtzer henry-oberholtzer force-pushed the discogs-fixes branch 2 times, most recently from 91ece4c to d68514e Compare December 9, 2025 07:34
@henry-oberholtzer henry-oberholtzer changed the title from "Discogs: fixes for: 6177, 6068, 6030" to "Discogs: fixes for: 6177, 6068" Dec 9, 2025
@henry-oberholtzer henry-oberholtzer marked this pull request as ready for review December 9, 2025 07:35
@henry-oberholtzer henry-oberholtzer requested a review from a team as a code owner December 9, 2025 07:35
sourcery-ai[bot]

This comment was marked as outdated.

@codecov

codecov bot commented Dec 9, 2025

Codecov Report

❌ Patch coverage is 94.26752% with 9 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.70%. Comparing base (1a899cc) to head (4ac4e19).
⚠️ Report is 5 commits behind head on master.
✅ All tests successful. No failed tests found.

Files with missing lines | Patch % | Lines
beetsplug/discogs.py | 94.26% | 7 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6179      +/-   ##
==========================================
+ Coverage   68.68%   68.70%   +0.01%     
==========================================
  Files         138      138              
  Lines       18532    18600      +68     
  Branches     3061     3069       +8     
==========================================
+ Hits        12729    12779      +50     
- Misses       5149     5169      +20     
+ Partials      654      652       -2     
Files with missing lines | Coverage Δ
beetsplug/discogs.py | 74.40% <94.26%> (+3.62%) ⬆️

... and 1 file with indirect coverage changes

@henry-oberholtzer

This comment was marked as resolved.

@henry-oberholtzer henry-oberholtzer marked this pull request as draft December 9, 2025 17:37
@henry-oberholtzer henry-oberholtzer marked this pull request as ready for review December 10, 2025 05:21
sourcery-ai[bot]

This comment was marked as outdated.

Contributor

@semohr semohr left a comment

I just had some time to have a look here. Added some comments for potential improvements.

Thanks for starting to tackle the discogs plugin, it is in dire need of some love 😄

@henry-oberholtzer
Member Author

Thanks Sebastian! I'm out of town at the moment but will get these applied once I'm back.

@semohr
Contributor

semohr commented Dec 23, 2025

No hurry! Take your time and enjoy the holidays.

@semohr semohr self-assigned this Dec 23, 2025
@henry-oberholtzer henry-oberholtzer added this to the 2.6.0 milestone Dec 30, 2025
@henry-oberholtzer henry-oberholtzer linked an issue Dec 31, 2025 that may be closed by this pull request
index = 0
divisions: list[str] = []
next_divisions: list[str] = []
t: TracklistInfo = {
Contributor

Can we try to not use single character variable names? I get that they simplify things in writing but my fish brain forgets what t is after reading like 20 lines 🤣

Member Author

Agreed - they're fine for hashing out the ideas but they do make retreading the code a lot harder.

Contributor

@semohr semohr left a comment

I think we should be good here!

While the whole logic is still a bit convoluted, I feel like it is a major improvement from before the PR 👍

Thank you for taking the initiative!

@snejus snejus self-assigned this Jan 6, 2026
@henry-oberholtzer
Member Author

Updated the variable name - can go ahead and merge if all looks good!

@semohr
Contributor

semohr commented Jan 7, 2026

I think @snejus wants to have a look too, since he self-assigned this as well.

Member

@snejus snejus left a comment

Sorry that it took me a long time to finally get to this! Well done, this is a big and important piece of work.

It seems to me that DiscogsPlugin now has a few too many responsibilities. I think it would be a good idea to move the logic from the _build_* methods to intermediate dataclasses, which should make it easier for us to reason about it all and hopefully simplify the logic. I refactored a couple of bits as I reviewed this, so I'm opening a PR based on your last commit for you to review!

class IntermediateTrackInfo(TrackInfo):
"""Allows work with string mediums from
get_track_info"""
class AlbumArtistInfo(ArtistInfo):
Member

❗ Note that AlbumInfo object uses artist, artists etc. fields without album prefix

artist, artist_id = self.get_artist(artist_list, join_key="join")
return self.strip_disambiguation(artist), artist_id

def _build_albumartistinfo(self, artists: list[Artist]) -> AlbumArtistInfo:
Member

The method above get_artist_with_anv is now unused and should be removed

)
artist_credit = album_artist_anv
# Information for the album artist
albumartist: AlbumArtistInfo = self._build_albumartistinfo(artist_data)
Member

You want to use this method directly since AlbumInfo does not support albumartist* fields.

Suggested change
- albumartist: AlbumArtistInfo = self._build_albumartistinfo(artist_data)
+ albumartist = self._build_artistinfo(artists, for_album_artist=True)

title: str
duration: str
artists: list[Artist]
extraartists: NotRequired[list[Artist]]
Member

Artist definition above needs fixing:

class Artist(TypedDict):
    name: str
    anv: str
    join: str
    role: str
    tracks: str
-   id: str
+   id: int
    resource_url: str

}
],
"artists": [
{"name": "ARTIST", "anv": "", "id": 2},
Member

Make sure that all "artists" and "extraartists" reflect the expected Artist shape:

Suggested change
- {"name": "ARTIST", "anv": "", "id": 2},
+ {
+     "name": "ARTIST",
+     "anv": "",
+     "id": 2,
+     "role": "",
+     "join": "",
+     "tracks": "",
+     "resource_url": "",
+ },

I'd strongly suggest defining a helper method:

def _artist(name: str, **kwargs):
    return {
        "id": 1,
        "name": name,
        "join": "",
        "role": "",
        "anv": "",
        "tracks": "",
        "resource_url": "",
    } | kwargs
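For illustration, a short sketch of how such a helper might be used when building test fixtures. The helper is repeated so the snippet stands alone; the fixture names and the "GUEST" artist are hypothetical and not taken from the test suite.

```python
def _artist(name: str, **kwargs):
    # Defaults for every Artist key, overridden by any keyword arguments.
    # The dict union operator (|) requires Python 3.9 or newer.
    return {
        "id": 1,
        "name": name,
        "join": "",
        "role": "",
        "anv": "",
        "tracks": "",
        "resource_url": "",
    } | kwargs


# Hypothetical fixture fragments: only the fields that matter are spelled out.
main = _artist("ARTIST", id=2)
featured = _artist("GUEST", id=3, role="Featuring")
release_fragment = {"artists": [main], "extraartists": [featured]}
```

Every fixture then carries the full Artist shape, while each test only states the fields it actually cares about.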

Comment on lines +425 to +426
anv = a.get("anv", "") or name
role = a.get("role", "").lower()
Member

All Artist keys are always present:

Suggested change
- anv = a.get("anv", "") or name
- role = a.get("role", "").lower()
+ anv = a["anv"] or name
+ role = a["role"].lower()
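A small standalone sketch of why direct indexing is enough here, assuming the Artist shape given earlier in this review (with `id` as an int). The function name `credited_name_and_role` is hypothetical; the sample entry is taken from the Discogs data quoted in this thread.

```python
from typing import TypedDict


class Artist(TypedDict):
    # Every key is required, so a["key"] cannot raise KeyError on a
    # well-formed entry and no .get() fallback is needed.
    name: str
    anv: str
    join: str
    role: str
    tracks: str
    id: int
    resource_url: str


def credited_name_and_role(a: Artist) -> tuple[str, str]:
    name = a["name"]
    anv = a["anv"] or name  # an empty ANV falls back to the canonical name
    role = a["role"].lower()
    return anv, role


adams: Artist = {
    "name": "Bryan Adams",
    "anv": "B.Adams",
    "join": "",
    "role": "Written-By",
    "tracks": "",
    "id": 10933,
    "resource_url": "https://api.discogs.com/artists/10933",
}
result = credited_name_and_role(adams)
```

The `or name` fallback is still needed because a present key can hold an empty string.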

name = config["va_name"].as_str()
anv = name
# If the artist is listed as featured
if "featuring" in role:
Member

What about cases like this one? See feat. Yanou, which has role: ""

>>> pprint(r.tracklist[0].data)
{
    'position': '1-01',
    'type_': 'track',
    'artists': [
        {'name': 'DJ Sammy', 'anv': '', 'join': '&', 'role': '', 'tracks': '', 'id': 47752, 'resource_url': 'https://api.discogs.com/artists/47752'},
        {'name': 'Yanou', 'anv': '', 'join': 'feat.', 'role': '', 'tracks': '', 'id': 43728, 'resource_url': 'https://api.discogs.com/artists/43728'},
        {'name': 'Do', 'anv': '', 'join': '', 'role': '', 'tracks': '', 'id': 52560, 'resource_url': 'https://api.discogs.com/artists/52560'}
    ],
    'title': 'Heaven',
    'extraartists': [
        {'name': 'Do', 'anv': '', 'join': '', 'role': 'Featuring', 'tracks': '', 'id': 52560, 'resource_url': 'https://api.discogs.com/artists/52560'},
        {
            'name': 'Frank Reinert',
            'anv': '',
            'join': '',
            'role': 'Producer [Assistant Producer]',
            'tracks': '',
            'id': 200378,
            'resource_url': 'https://api.discogs.com/artists/200378'
        },
        {'name': 'DJ Sammy', 'anv': '', 'join': '', 'role': 'Producer [Produced By]', 'tracks': '', 'id': 47752, 'resource_url': 'https://api.discogs.com/artists/47752'},
        {'name': 'Yanou', 'anv': '', 'join': '', 'role': 'Producer [Produced By]', 'tracks': '', 'id': 43728, 'resource_url': 'https://api.discogs.com/artists/43728'},
        {'name': 'Bryan Adams', 'anv': 'B.Adams', 'join': '', 'role': 'Written-By', 'tracks': '', 'id': 10933, 'resource_url': 'https://api.discogs.com/artists/10933'},
        {'name': 'Jim Vallance', 'anv': 'J.Vallance', 'join': '', 'role': 'Written-By', 'tracks': '', 'id': 266699, 'resource_url': 'https://api.discogs.com/artists/266699'}
    ],
    'duration': '3:39'
}

Member Author

This is one of those weird quirks with the Discogs system. If they're listed as a main artist, they'll never have a role listed, it'll just be blank, but will get assembled properly for the main artist field.

In extraartists, Do appears again, this time with Featuring as the role, which will cause the featured artist to appear twice, an issue I noticed in #6166. Both of these ways of noting an artist as featuring are apparently within Discogs guidelines, and releases sometimes list featured artists in the main artist field, in the extra artists, or in both. It's really a wild west out there.
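One possible way to avoid the double listing, sketched under the assumption that matching on the Discogs artist id is reliable enough. This is not what the plugin does; `extra_featured` is a hypothetical helper, and the data is trimmed from the example above.

```python
def extra_featured(artists: list[dict], extraartists: list[dict]) -> list[dict]:
    """Return featured extra artists that are not already main artists."""
    main_ids = {a["id"] for a in artists}
    return [
        a
        for a in extraartists
        if "featuring" in a["role"].lower() and a["id"] not in main_ids
    ]


artists = [
    {"name": "DJ Sammy", "id": 47752, "role": ""},
    {"name": "Yanou", "id": 43728, "role": ""},
    {"name": "Do", "id": 52560, "role": ""},
]
extraartists = [
    {"name": "Do", "id": 52560, "role": "Featuring"},
    {"name": "Frank Reinert", "id": 200378, "role": "Producer [Assistant Producer]"},
]

# Do is already a main artist, so nothing extra is reported.
already_listed = extra_featured(artists, extraartists)
# If Do were credited only in extraartists, the entry would be kept.
only_in_extras = extra_featured(artists[:2], extraartists)
```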

if not featured_flag:
artist += feat_str
artist_anv += feat_str
artist += name
Member

This method is very complicated and hard to reason about. I think it can be simplified, and moved to a dedicated intermediate class, something along these lines:

@dataclass
class ArtistState:
    class ValidArtist(NamedTuple):
        name: str
        credit: str
        join: str
        is_feat: bool

        def get_artist(self, property_name: str) -> str:
            return getattr(self, property_name) + (
                {",": ", ", "": ""}.get(self.join, f" {self.join} ")
            )

    raw_artists: list[Artist]
    use_anv: bool
    use_credit_anv: bool
    featured_string: str
    should_strip_disambiguation: bool

    @property
    def info(self) -> ArtistInfo:
        return {k: getattr(self, k) for k in ArtistInfo.__annotations__}  # type: ignore[return-value]

    @property
    def artists_ids(self) -> list[str]:
        return [str(a["id"]) for a in self.raw_artists]

    @property
    def artist_id(self) -> str:
        return self.artists_ids[0]

    def strip_disambiguation(self, text: str) -> str:
        """Removes discogs specific disambiguations from a string.
        Turns 'Label Name (5)' to 'Label Name' or 'Artist (1) & Another Artist (2)'
        to 'Artist & Another Artist'. Does nothing if strip_disambiguation is False."""
        if self.should_strip_disambiguation:
            return DISAMBIGUATION_RE.sub("", text)
        return text

    @cached_property
    def valid_artists(self) -> list[ValidArtist]:
        va_name = config["va_name"].as_str()
        return [
            self.ValidArtist(
                self.strip_disambiguation(anv if self.use_anv else name),
                self.strip_disambiguation(anv if self.use_credit_anv else name),
                a["join"],
                is_feat,
            )
            for a in self.raw_artists
            if (
                (name := va_name if a["name"] == "Various" else a["name"])
                and (anv := a["anv"] or name)
                and (
                    (is_feat := ("featuring" in a["role"].lower()))
                    or not a["role"]
                )
            )
        ]

    @property
    def artists(self) -> list[str]:
        return [a.name for a in self.valid_artists]

    @property
    def artists_credit(self) -> list[str]:
        return [a.credit for a in self.valid_artists]

    @property
    def artist(self) -> str:
        return self.join_artists("name")

    @property
    def artist_credit(self) -> str:
        return self.join_artists("credit")

    def join_artists(self, property_name: str) -> str:
        non_featured = [a for a in self.valid_artists if not a.is_feat]
        featured = [a for a in self.valid_artists if a.is_feat]

        artist = "".join(a.get_artist(property_name) for a in non_featured)
        if featured:
            if "feat" not in artist:
                artist += f" {self.featured_string} "

            artist += ", ".join(a.get_artist(property_name) for a in featured)

        return artist

    @classmethod
    def from_plugin(
        cls,
        plugin: DiscogsPlugin,
        artists: list[Artist],
        for_album_artist: bool = False,
    ) -> ArtistState:
        return cls(
            artists,
            plugin.config["anv"][
                "album_artist" if for_album_artist else "artist"
            ].get(bool),
            plugin.config["anv"]["artist_credit"].get(bool),
            plugin.config["featured_string"].as_str(),
            plugin.config["strip_disambiguation"].get(bool),
        )

Note I'd implemented this functionality in my fork a while ago (I also parse composers and remixers), so the dataclass above takes inspiration from there.
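To make the join handling in `get_artist` above easier to follow, here is the same separator rule isolated as a standalone sketch and applied to the main artists from the "Heaven" example. The function name `with_join` is hypothetical.

```python
def with_join(name: str, join: str) -> str:
    # A comma join becomes ", ", an empty join adds nothing, and any
    # other join string (such as "&" or "feat.") is padded with spaces.
    separator = {",": ", ", "": ""}.get(join, f" {join} ")
    return name + separator


main_artists = [("DJ Sammy", "&"), ("Yanou", "feat."), ("Do", "")]
joined = "".join(with_join(name, join) for name, join in main_artists)
# joined == "DJ Sammy & Yanou feat. Do"
```

Because the last artist normally has an empty join, the concatenated string ends cleanly without trailing separators.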

@snejus
Member

snejus commented Jan 9, 2026

See #6277

@JOJ0 JOJ0 added the plugin Pull requests that are plugins related label Jan 10, 2026

Labels

discogs plugin Pull requests that are plugins related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unexpected flex attr saved by Discogs plugin: medium_str
discogs: add support for multi value fields

5 participants