Replicas should atomically update their distributions#7492
Replicas should atomically update their distributions#7492
Conversation
95d534b to
8c05e06
Compare
2ca9bc6 to
88a62e8
Compare
It's not ideal for distributions to pick up replica changes at random time intervals as various tasks complete, ideally the entire replica is presented as updated at once (or with the smallest possible window). closes pulp#7333 Assisted-By: claude-opus-4.6
88a62e8 to
a92f9e2
Compare
| elif upstream_distribution.get("repository_version"): | ||
| # Extract repository href from repository_version href | ||
| repo_href = upstream_distribution["repository_version"].rsplit("versions/", 1)[0] | ||
| manifest = self.repository_ctx_cls(self.pulp_ctx, repo_href).entity["manifest"] |
There was a problem hiding this comment.
This is tricky, it implies that other plugins may also have compatibility issues without modification
There was a problem hiding this comment.
I believe pulp_python may also need a tweak
|
Pulp Python: pulp/pulp_python#1174 Pulp Container: pulp/pulp_container#2301 |
pulpcore/app/replica.py
Outdated
| if k not in ("repository_version", "repository", "publication") | ||
| } | ||
| create_data["name"] = upstream_distribution["name"] | ||
| create_data["pulp_labels"] = distribution_data["pulp_labels"] |
There was a problem hiding this comment.
I remember putting the pulp_labels outside of distribution_extra_fields since not every plugin called the super method when they overwrote it, but maybe with this change we could do that now.
645e954 to
56c4df9
Compare
4087809 to
0b10f41
Compare
|
Want me to squash the dead code commit? |
pulpcore/app/replica.py
Outdated
| # Clear repository and publication so they don't conflict when | ||
| # finalize_replication sets repository_version atomically. |
There was a problem hiding this comment.
While this is needed for the update, we are now going to create a scenario on the first replication after upgrade where distros will start returning 404s until the replication completes.
There was a problem hiding this comment.
@gerrod3 You're talking about the very first replication after the upgrade, before the repository version has been set? But after that it should be OK, right?
Maybe that's an acceptable cost as long as it's documented - unless you would rather be more lenient with clearing it now at the expense of being a bit less idempotent? I guess we would then need to deal with the possibility that repository might be set later on, and clear it when setting the repository_version?
There was a problem hiding this comment.
Why don't we just remove the repository and publication fields from extra_data and then in the atomic update set both to None. This way we guarantee a smooth update and replication will always correctly update the distribution to the next repo-version even if the user was meddling with them.
There was a problem hiding this comment.
This approach is simpler and I think I prefer it, but it will still have a problem on the first replication, which is that the existing replicated repos which have repository set will still not have them cleared until the end. Therefore, the first replication post-upgrade would likely have the old non-atomic behavior.
Honestly that is still better than 404s though.
11ca478 to
7c33cd5
Compare
gerrod3
left a comment
There was a problem hiding this comment.
Thanks for all the changes, looks good.
| "repository": get_url(repository), | ||
| "publication": None, |
7c33cd5 to
8434df5
Compare
Generated-By: claude-opus-4.6
8434df5 to
9e2158e
Compare
It's not ideal for distributions to pick up replica changes at random time intervals as various tasks complete, ideally the entire replica is presented as updated at once (or with the smallest possible window).
closes #7333
📜 Checklist
See: Pull Request Walkthrough