git(hub) integration feature #791
Closed
sgfost wants to merge 31 commits into comses:refactor/metadata-generation from sgfost:feat/git-mirroring
73bee28 to
b25ba7b
Compare
b25ba7b to
a64a05e
Compare
https://huey.readthedocs.io/
the huey consumer runs as a runit daemon in the server service
the default dev mode behavior of immediate=True (tasks run synchronously) is currently disabled for testing purposes
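For context, a minimal sketch of what switching off immediate mode can look like in a Django settings module that uses huey's Django integration. Only the `immediate` flag relates to the commit above; the queue name, Redis host, and worker settings are placeholder assumptions, not values taken from this PR.

```python
# Settings sketch with assumed values; only the "immediate" flag is the point.
DEBUG = True

HUEY = {
    "huey_class": "huey.RedisHuey",  # Redis-backed task queue
    "name": "example-queue",  # hypothetical queue name
    # huey's Django integration defaults immediate mode to DEBUG, which runs
    # tasks synchronously in-process. Forcing False hands tasks to the consumer
    # even in dev, so the async path is exercised.
    "immediate": False,
    "connection": {"host": "redis", "port": 6379},  # hypothetical host
    "consumer": {"workers": 2, "worker_type": "thread"},  # hypothetical sizing
}
```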
adding these manually was an easily forgotten step that wouldn't be noticed in dev but would fail to build in prod
codemeta_snapshot will be used to keep a codemeta data structure updated along with changes to metadata, which makes it easier to watch for changes and speeds up access
license text is created from a template for each release and included in the fs package as a LICENSE file
ref comses/planning#234
and replace redis caching of codebase all_contributor lists with querysets (did save a few queries but doesn't seem to have any meaningful performance impact)
codebases were considering any citable release contributor as an author, while releases considered anyone with a role of "author" to be an author. Now we use a union of the two. Not sure if this is the best way, but regardless, it's easier to change since it all stems from authors() and nonauthors() on the ReleaseContributorQuerySet
codebase and release both now have 'nonauthor contributor' accessors, which is useful because this is what things like codemeta/datacite/etc. consider 'contributors'
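The union rule described above can be illustrated with plain Python. This is only a sketch: the `Contributor` class and its `include_in_citation` flag are hypothetical stand-ins for the real model fields, and the actual implementation lives on the `ReleaseContributorQuerySet`, not on lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contributor:
    name: str
    roles: frozenset
    include_in_citation: bool  # hypothetical flag for a "citable" contributor


def is_author(c: Contributor) -> bool:
    # Union of the two older rules: citable OR holds the "author" role.
    return c.include_in_citation or "author" in c.roles


def authors(contributors):
    return [c for c in contributors if is_author(c)]


def nonauthors(contributors):
    # Everyone else; roughly what codemeta/datacite call "contributors".
    return [c for c in contributors if not is_author(c)]


people = [
    Contributor("Ada", frozenset({"author"}), False),
    Contributor("Grace", frozenset({"maintainer"}), True),
    Contributor("Linus", frozenset({"tester"}), False),
]
print([c.name for c in authors(people)])     # ['Ada', 'Grace']
print([c.name for c in nonauthors(people)])  # ['Linus']
```

Keeping both accessors derived from a single predicate is what makes the rule easy to change later.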
move metadata conversion to a metadata module which provides converters for different formats. codemeta is used as the primary format, from which the others (datacite, cff) can be derived
the primary codemeta accessor is the codemeta_snapshot json field, which is rebuilt each time a codebase/release is saved
* add `update_codebase_metadata` command to update the codemeta snapshot for all objects, then update packages on the fs
* add CITATION.cff file to fs package
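To make the "codemeta as the primary format" idea concrete, here is a rough sketch of deriving a minimal CFF structure from a CodeMeta-style dict. The function name `codemeta_to_cff` and the sample record are hypothetical; the converters in this PR are not shown on this page and will cover far more fields.

```python
def codemeta_to_cff(codemeta: dict) -> dict:
    """Derive a minimal CFF-style dict from a CodeMeta-style dict."""
    cff = {
        "cff-version": "1.2.0",
        "message": "If you use this software, please cite it as below.",
        "title": codemeta["name"],
        "authors": [
            {
                "given-names": person.get("givenName", ""),
                "family-names": person.get("familyName", ""),
            }
            for person in codemeta.get("author", [])
        ],
    }
    # Optional fields are only copied over when present in the source record.
    if "version" in codemeta:
        cff["version"] = codemeta["version"]
    return cff


# Hypothetical snapshot, shaped like a small CodeMeta document.
snapshot = {
    "@type": "SoftwareSourceCode",
    "name": "Example Model",
    "version": "1.0.0",
    "author": [{"@type": "Person", "givenName": "Ada", "familyName": "Lovelace"}],
}
cff = codemeta_to_cff(snapshot)
```

Serializing the result as YAML would give the body of a CITATION.cff file.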
usage of the old datacite metadata generation still needs to be replaced
* visually indicate that the release metadata form is saving, since this takes a little bit longer now
17f1895 to
9cbc597
Compare
a64a05e to
81edfd5
Compare
and resolve some edge case bugs with metadata generation.
test_codemeta was primarily checking to make sure that codemeta was conforming to the expected schema, and this is implicit now
we may still want some test module that uses hypothesis, but it would be even more useful to do this at a higher level, e.g. create a bunch of codebase+releases and see if anything goes wrong downstream
this API is responsible for managing a local git repository mirror for a comses codebase. PUBLIC release archives are commits/tags in the history. Release branches are created for each release and only added to if there is an update to metadata
`build()` and `append_releases()` are the two main API methods, which construct (or rebuild) a git repo and add new releases to the repo, respectively
`update_release_branch()` will add a new commit containing changes to a release branch (and update main if they point to the same thing). This will mainly be used for updating metadata
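The branch/tag layout described above can be modeled with a toy in-memory repository. This is a sketch of the idea only, not the API from this PR: `ToyRepo`, the `release/<version>` branch naming, and the commit ids are all hypothetical, and no real git operations are performed.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyRepo:
    commits: dict = field(default_factory=dict)   # commit id -> (parent id, message)
    branches: dict = field(default_factory=dict)  # branch name -> commit id
    tags: dict = field(default_factory=dict)      # tag name -> commit id
    _ids: count = field(default_factory=lambda: count(1))

    def _commit(self, parent, message):
        cid = f"c{next(self._ids)}"
        self.commits[cid] = (parent, message)
        return cid

    def append_release(self, version):
        """Commit a release on main, tag it, and branch off at that commit."""
        cid = self._commit(self.branches.get("main"), f"release {version}")
        self.branches["main"] = cid
        self.tags[version] = cid
        self.branches[f"release/{version}"] = cid
        return cid

    def update_release_branch(self, version, message):
        """Add a commit to a release branch; main follows only if it pointed there."""
        branch = f"release/{version}"
        old = self.branches[branch]
        new = self._commit(old, message)
        self.branches[branch] = new
        if self.branches["main"] == old:
            self.branches["main"] = new
        return new


repo = ToyRepo()
repo.append_release("1.0.0")
repo.append_release("1.1.0")
repo.update_release_branch("1.0.0", "update metadata")  # main stays on 1.1.0
repo.update_release_branch("1.1.0", "update metadata")  # main moves along
```

Because metadata updates only append commits to a release branch, the tagged release commits never change, which is what avoids rewriting history.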
the GithubApi provides access to auth and repository actions
adds 3 huey (async) tasks for creating a mirror, updating a mirror, and updating metadata for a single release of a mirror
* /github page to describe the integration features
* sidebar element on release detail page will show information about integration status for that codebase, and allow users with edit permissions to create a new mirror
a92874c to
8c42b35
Compare
sgfost referenced this pull request Jan 23, 2025
- include in .env PATH and make docker-compose.yml target depend on .env
- replace deprecated usage of self.assertEquals https://docs.python.org/3/whatsnew/3.12.html#id3
8c42b35 to
baed067
Compare
* use installation access tokens for user repos instead of user access tokens. this is a more secure workflow
* add GithubIntegrationAppInstallation model for recording app installations (this will need to be created/updated using webhooks)
* CodebaseGitMirror/"mirror" now refers to the local git repository
* ^ can have multiple CodebaseGitRemote's which keep track of all the information needed to push to/archive from remote repositories
TODO: re-implement views
baed067 to
d3432cb
Compare
this replaces the simple modal form to give better control over the feature
The distinguishing feature is whether the release has a non-null external_release_package
This will be used to 'archive' or pull in releases made on github for synced repos
currently, the release assets/package is not stored on the filesystem, instead relying on an external download url, and being only concerned with metadata
App installation tokens did not give access to get/create repos on a user's account. Still trying to avoid storing user access tokens (https://docs.github.com/en/apps/creating-github-apps/authenticating-with-a-github-app/generating-a-user-access-token-for-a-github-app), so as a workaround, we will direct the submitter to create a new bare repo before continuing
* add handler for webhook events for the github sync app
* add form/wizard for linking a pre-existing github repo (archive only)
allows setting up a push/archive sync by providing an empty repo, which the generated git repo will automatically be pushed to, as well as setting up an archive-only sync by providing any github repo
in both cases, the submitter needs to:
- link their github account with the regular oauth flow (so we have a way to match users with a github account)
- install the provided 'GitHub Sync' app on their github account with access to any repository that will be synced
and squash migrations
"import(ing)" is the wording I keep finding describes the process the best
other potential names and their issues:
* publish - same name as the direct publishing, releases need to be manually published after they are imported
* pull - git command that is not used
* fetch - git command that is not used
* ingest - ok and similar to import but not quite as clear
b4786f6 to
8c05280
Compare
* re-order and clarify the steps to set up a sync (app installation takes place after creating a repo so that permissions can be restricted)
* fix push log to actually show useful information
* when toggling push back on, do a build/update + push on the spot
* better error/success messages
ded78e4 to
f3b298d
Compare
refactors the `CodebaseReleaseFsApi` to inherit from an abstract `BaseCodebaseReleaseFsApi`, along with the new `ImportedCodebaseReleaseFsApi`
the main function of the imported release fs api is to import releases from a remote source by downloading an archive to originals/, extracting to sip/data/ and then using the inherited functionality to build archives for review/publishing. It does not implement methods for dealing with files directly outside of this importing workflow like add(), delete(), rebuild(), etc.
the imported release fs api manages a manifest for keeping track of file categories. Using this for all releases is potentially something to do in the future but is a rather complicated refactor that had to be dialed back here
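A rough, stdlib-only sketch of the import workflow described above: store the archive under originals/, extract it to sip/data/, and record a category per file in a manifest. Everything here is illustrative: `import_release`, `categorize`, and the suffix-based category rules are hypothetical, and the download step is replaced by a local file copy.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def categorize(path: Path) -> str:
    """Guess a file category from its suffix (hypothetical rule set)."""
    suffix = path.suffix.lower()
    if suffix in {".py", ".r", ".java", ".nlogo"}:
        return "code"
    if suffix in {".md", ".txt", ".pdf"}:
        return "docs"
    if suffix in {".csv", ".json"}:
        return "data"
    return "uncategorized"


def import_release(archive: Path, release_dir: Path) -> dict:
    """Copy the archive to originals/, extract to sip/data/, return a manifest."""
    originals = release_dir / "originals"
    data_dir = release_dir / "sip" / "data"
    originals.mkdir(parents=True, exist_ok=True)
    data_dir.mkdir(parents=True, exist_ok=True)
    stored = originals / archive.name
    shutil.copy2(archive, stored)  # stands in for the remote download
    with zipfile.ZipFile(stored) as zf:
        zf.extractall(data_dir)
    # Manifest maps each extracted file (relative path) to a category.
    return {
        p.relative_to(data_dir).as_posix(): categorize(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


# Build a tiny archive in a temp dir and import it.
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = Path(tmp)
    archive = tmp_path / "v1.0.0.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("model.py", "print('hello')\n")
        zf.writestr("README.md", "# demo\n")
    manifest = import_release(archive, tmp_path / "release")
```

In this sketch the manifest is just a dict; persisting it next to the extracted files would let a release editor re-categorize files later.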
* webhook handler watches for github release events and creates a db record and delegates to the FS api to download and extract a release archive into the library FS
TODO:
- tailor the release editor UI for imported releases, including a way to categorize files
- extract initial metadata from the github release archive
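As a sketch of what a handler for github release events might do before creating a db record, the snippet below checks the webhook signature and pulls out the fields needed to download an archive. The signature check (HMAC-SHA256 in the `X-Hub-Signature-256` header) is standard GitHub webhook behavior but is an assumption about this PR, which does not say how requests are verified; `handle_webhook` and its return shape are hypothetical.

```python
import hashlib
import hmac
import json


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Compare the X-Hub-Signature-256 header against our own HMAC of the body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_header.encode())


def handle_webhook(secret: bytes, headers: dict, body: bytes):
    """Return a dict describing the release to import, or None to ignore the event."""
    if not verify_signature(secret, body, headers.get("X-Hub-Signature-256", "")):
        raise PermissionError("invalid webhook signature")
    if headers.get("X-GitHub-Event") != "release":
        return None
    payload = json.loads(body)
    if payload.get("action") != "published":
        return None
    release = payload["release"]
    return {
        "repo": payload["repository"]["full_name"],
        "tag": release["tag_name"],
        "archive_url": release["zipball_url"],
    }


# Simulated delivery with a made-up secret and payload.
secret = b"example-secret"
body = json.dumps({
    "action": "published",
    "release": {"tag_name": "v1.0.0", "zipball_url": "https://example.invalid/zip"},
    "repository": {"full_name": "someuser/some-model"},
}).encode()
headers = {
    "X-GitHub-Event": "release",
    "X-Hub-Signature-256": "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest(),
}
result = handle_webhook(secret, headers, body)
```

The returned dict is the kind of information a task could hand to the FS api to download and extract the archive.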
including file categorization with an improved file tree
currently pulls out: license, release notes, languages, platforms, os
from: github repo/release data, any found codemeta file, any found CITATION.cff file
Contributor
Author
I believe all of the main functionality for the 2-way sync is now implemented and works for simple cases. It is likely neither bulletproof nor polished, so I now need to do lots and lots of testing
a4e039d to
4b4dd19
Compare
and fix some typescript complaints
4b4dd19 to
d00c9b3
Compare
2254cdb to
a3d2e14
Compare
Contributor
Author
replaced by #815
this relies on changes in #790
part 1 (1-way mirror):
adds a button that allows model submitters to create an auto-updating, read-only git repository archive which is hosted on a central organization
the 'mirror' git repository consists of commits for each release that are tagged and branched off so that metadata can be updated for individual releases without re-writing history
additions
- `library.fs.CodebaseGitRepositoryApi`: functionality for building/updating a git repository from a `Codebase`
- `library.github.GithubApi`: provides an interface over `PyGithub` for interacting with repositories on github
- `library.github.GithubRepoNameValidator`: provides `validate()` to make sure a repo name is valid and unused
- `mirror_codebase()` and `update_mirrored_codebase()` huey tasks which call the `CodebaseGitRepositoryApi` to build the git repo on the file system and then `GithubApi` to create/push to the remote
- `update_mirrored_release_metadata()` huey task which is triggered when there is an update to codemeta, and updates the corresponding release branch in git after the submission package is rebuilt
- `/github/`
configuration steps
- `comses-model-library` organization with the following permissions:
- `<HOST>/github-sync-webhook/` ** the trailing slash is very important for some reason
- `Release` webhook event
- `.env`
- `.env`
- `secrets/`
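As an illustration of the kind of check `GithubRepoNameValidator.validate()` (listed under additions) is described as doing, here is a minimal sketch. The rules encoded below (ASCII letters, digits, `.`, `-`, `_`, at most 100 characters, case-insensitive uniqueness) reflect my understanding of GitHub's naming constraints and should be treated as assumptions; the function signature and the `existing_names` argument are hypothetical, and the real validator presumably checks for existing repos through the API.

```python
import re

# Assumed naming rule: 1-100 chars from ASCII letters, digits, '.', '-', '_'.
NAME_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,100}")


def validate(name: str, existing_names) -> str:
    """Return the name if it looks valid and unused, else raise ValueError."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid repository name: {name!r}")
    if name in {".", ".."}:
        raise ValueError(f"reserved repository name: {name!r}")
    # Treat names as case-insensitive when checking for collisions.
    if name.lower() in {n.lower() for n in existing_names}:
        raise ValueError(f"repository name already in use: {name!r}")
    return name


taken = ["Wolf-Sheep", "ants_v2"]
ok = validate("fire-spread.model", taken)
```

Passing the set of taken names in as an argument keeps the sketch testable without network access.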