[RFC] commit-reach: skip STALE drain when only one merge-base needed#2109
[RFC] commit-reach: skip STALE drain when only one merge-base needed#2109spkrka wants to merge 1 commit intogitgitgadget:masterfrom
Conversation
Welcome to GitGitGadgetHi @spkrka, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests. Please make sure that either:
You can CC potential reviewers by adding a footer to the PR description with the following syntax: NOTE: DO NOT copy/paste your CC list from a previous GGG PR's description, Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:
It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code. Contributing the patchesBefore you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form Both the person who commented An alternative is the channel Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment If you want to see what email(s) would be sent for a After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail). If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the curl -g --user "<EMailAddress>:<Password>" \
--url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txtTo iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description): To send a new iteration, just add another PR comment with the contents: Need help?New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join. You may also be able to find help in real time in the developer IRC channel, |
|
There is an issue in commit a43f270:
|
a43f270 to
09c61b0
Compare
|
There is an issue in commit 09c61b0:
|
09c61b0 to
50f4c7d
Compare
derrickstolee
left a comment
There was a problem hiding this comment.
Meta comments:
- You can use your PR description to introduce yourself as a new contributor. It will not impact the commit message that is stored in the final repository, but helps folks get to know you.
- I like your custom tests. I'd be interested to know if any performance tests in t/perf/ would help. You can
cd t/perf/ && ./run HEAD~1 HEAD -- p<num>-<name>.sh. However: I checked the existing tests and I don't think that any will actually show meaningful results in the way your manual test does.
|
/allow |
|
User spkrka is now allowed to use GitGitGadget. WARNING: spkrka has no public email address set on GitHub; GitGitGadget needs an email address to Cc: you on your contribution, so that you receive any feedback on the Git mailing list. Go to https://github.com/settings/profile to make your preferred email public to let GitGitGadget know which email address to use. |
When find_all is false and generation numbers are available, the priority queue pops in non-increasing generation order. The first doubly-painted commit is a valid best merge-base; no later commit can dominate it. Skip the expensive STALE drain in this case. The early exit is guarded by three conditions: find_all must be false, the commit-graph must provide generation numbers, and the merge-base commit itself must have a finite generation (not GENERATION_NUMBER_INFINITY from being outside the commit-graph). Add find_all parameter to repo_get_merge_bases_many_dirty() and thread it through to paint_down_to_common(). git merge-base (without --all) passes show_all=0, triggering the early exit. On a 2.2M-commit merge-heavy monorepo with commit-graph: HEAD vs ~500: 5,229ms -> 24ms HEAD vs ~1000: 4,214ms -> 39ms HEAD vs ~5000: 3,799ms -> 46ms HEAD vs ~10000: 3,827ms -> 61ms Signed-off-by: Kristofer Karlsson <krka@spotify.com>
50f4c7d to
54cdf9b
Compare
|
/preview |
|
Preview email sent as pull.2109.git.1778252509243.gitgitgadget@gmail.com |
|
/submit |
|
Submitted as pull.2109.git.1778252837132.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
|
This patch series was integrated into seen via git@0b5178e. |
|
This patch series was integrated into seen via git@b7cd8e6. |
Context for what this is all about.
I am working with a very large git monorepo and have been investigating performance issue. After some digging I ended up looking more deeply into git merge-base. I saw it had an --all parameter but the default is to only return a single merge-base. Looking through the code and adding debug timing, I realized that although the total time to compute the merge-base was high, a very small amount of time was spent finding the initial merge-base value that was later returned.
The optimization is actually quite dramatic in a large repo - runtime went down from 5000ms to 50ms, so it's roughly a 100x optimization. This comes from an exploding frontier of STALE commits to drain.
Thus, my idea is simply to return early from the function once we know what will be returned. This only works if we find a candidate that we know will not be pruned later - but fortunately if we have a commit graph with generations we will visit commits in order such that it will actually not be pruned.
CC: Derrick Stolee stolee@gmail.com