Skip to content

[RFC] commit-reach: skip STALE drain when only one merge-base needed#2109

Open
spkrka wants to merge 1 commit intogitgitgadget:masterfrom
spkrka:merge-base-early-exit
Open

[RFC] commit-reach: skip STALE drain when only one merge-base needed#2109
spkrka wants to merge 1 commit intogitgitgadget:masterfrom
spkrka:merge-base-early-exit

Conversation

@spkrka
Copy link
Copy Markdown

@spkrka spkrka commented May 8, 2026

Context for what this is all about.

I am working with a very large git monorepo and have been investigating performance issue. After some digging I ended up looking more deeply into git merge-base. I saw it had an --all parameter but the default is to only return a single merge-base. Looking through the code and adding debug timing, I realized that although the total time to compute the merge-base was high, a very small amount of time was spent finding the initial merge-base value that was later returned.

The optimization is actually quite dramatic in a large repo - runtime went down from 5000ms to 50ms, so it's roughly a 100x optimization. This comes from an exploding frontier of STALE commits to drain.

Thus, my idea is simply to return early from the function once we know what will be returned. This only works if we find a candidate that we know will not be pruned later - but fortunately if we have a commit graph with generations we will visit commits in order such that it will actually not be pruned.

CC: Derrick Stolee stolee@gmail.com

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 8, 2026

Welcome to GitGitGadget

Hi @spkrka, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that either:

  • Your Pull Request has a good description, if it consists of multiple commits, as it will be used as cover letter.
  • Your Pull Request description is empty, if it consists of a single commit, as the commit message should be descriptive enough by itself.

You can CC potential reviewers by adding a footer to the PR description with the following syntax:

CC: Revi Ewer <revi.ewer@example.com>, Ill Takalook <ill.takalook@example.net>

NOTE: DO NOT copy/paste your CC list from a previous GGG PR's description,
because it will result in a malformed CC list on the mailing list. See
example.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about, and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the Libera Chat IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this. Note that any reviewers CC'd via the list in the PR description will not actually be sent emails.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Libera Chat. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

@gitgitgadget gitgitgadget Bot added the new user label May 8, 2026
@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 8, 2026

There is an issue in commit a43f270:
commit-reach: skip STALE drain in paint_down_to_common when only one merge-base needed

  • First line of commit message is too long (> 76 columns)

@spkrka spkrka force-pushed the merge-base-early-exit branch from a43f270 to 09c61b0 Compare May 8, 2026 13:53
@spkrka spkrka changed the title commit-reach: skip STALE drain when only one merge-base needed [RFC] commit-reach: skip STALE drain when only one merge-base needed May 8, 2026
@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 8, 2026

There is an issue in commit 09c61b0:
commit-reach: skip STALE drain in paint_down_to_common when only one merge-base needed

  • First line of commit message is too long (> 76 columns)

@spkrka spkrka force-pushed the merge-base-early-exit branch from 09c61b0 to 50f4c7d Compare May 8, 2026 13:55
Copy link
Copy Markdown

@derrickstolee derrickstolee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Meta comments:

  • You can use your PR description to introduce yourself as a new contributor. It will not impact the commit message that is stored in the final repository, but helps folks get to know you.
  • I like your custom tests. I'd be interested to know if any performance tests in t/perf/ would help. You can cd t/perf/ && ./run HEAD~1 HEAD -- p<num>-<name>.sh. However: I checked the existing tests and I don't think that any will actually show meaningful results in the way your manual test does.

Comment thread t/t6010-merge-base.sh
Comment thread commit-reach.c
@derrickstolee
Copy link
Copy Markdown

/allow

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 8, 2026

User spkrka is now allowed to use GitGitGadget.

WARNING: spkrka has no public email address set on GitHub; GitGitGadget needs an email address to Cc: you on your contribution, so that you receive any feedback on the Git mailing list. Go to https://github.com/settings/profile to make your preferred email public to let GitGitGadget know which email address to use.

When find_all is false and generation numbers are available, the
priority queue pops in non-increasing generation order.  The first
doubly-painted commit is a valid best merge-base; no later commit
can dominate it.  Skip the expensive STALE drain in this case.

The early exit is guarded by three conditions: find_all must be
false, the commit-graph must provide generation numbers, and the
merge-base commit itself must have a finite generation (not
GENERATION_NUMBER_INFINITY from being outside the commit-graph).

Add find_all parameter to repo_get_merge_bases_many_dirty() and
thread it through to paint_down_to_common().  git merge-base
(without --all) passes show_all=0, triggering the early exit.

On a 2.2M-commit merge-heavy monorepo with commit-graph:

  HEAD vs ~500:   5,229ms -> 24ms
  HEAD vs ~1000:  4,214ms -> 39ms
  HEAD vs ~5000:  3,799ms -> 46ms
  HEAD vs ~10000: 3,827ms -> 61ms

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
@spkrka spkrka force-pushed the merge-base-early-exit branch from 50f4c7d to 54cdf9b Compare May 8, 2026 14:46
@spkrka
Copy link
Copy Markdown
Author

spkrka commented May 8, 2026

/preview

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 8, 2026

Preview email sent as pull.2109.git.1778252509243.gitgitgadget@gmail.com

@spkrka
Copy link
Copy Markdown
Author

spkrka commented May 8, 2026

/submit

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 8, 2026

Submitted as pull.2109.git.1778252837132.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-2109/spkrka/merge-base-early-exit-v1

To fetch this version to local tag pr-2109/spkrka/merge-base-early-exit-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-2109/spkrka/merge-base-early-exit-v1

@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 9, 2026

This patch series was integrated into seen via git@0b5178e.

@gitgitgadget gitgitgadget Bot added the seen label May 9, 2026
@gitgitgadget
Copy link
Copy Markdown

gitgitgadget Bot commented May 9, 2026

This patch series was integrated into seen via git@b7cd8e6.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants