
Avoid unnecessary DisjunctionMaxBulkScorer overhead#15659

Open
shimpeko wants to merge 6 commits into apache:main from shimpeko:dismax-bulk-heuristic

Conversation

@shimpeko

@shimpeko shimpeko commented Feb 2, 2026

This change inspects clause bulk scorers up front and only uses DisjunctionMaxBulkScorer if at least one clause provides a non-default BulkScorer. Otherwise, we fall back to the scorer-based path.

c88f933 made DisjunctionMaxQuery use DisjunctionMaxBulkScorer when tieBreakerMultiplier == 0 and scoreMode == TOP_SCORES. However, this bulk path primarily pays off when at least one clause implements a specialized BulkScorer.

When all clauses return DefaultBulkScorer, the bulk windowing and replay logic adds overhead while preventing effective use of minCompetitiveScore and block-max optimizations that the scorer-based DisjunctionMaxScorer supports in TOP_SCORES mode.

In such cases, falling back to the scorer-based path typically results in better performance and restores competitive-score-based skipping.

Fixes: #15658
Related PR: #14040
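The heuristic described above can be sketched as follows. This is an illustrative stand-in, not Lucene's actual Weight code; the class and method names are hypothetical.

```java
// Hypothetical sketch of the heuristic originally proposed in this PR:
// use the bulk path only if at least one clause provides a specialized
// (non-default) BulkScorer. Names are illustrative, not Lucene's.
import java.util.List;

public class DisMaxBulkHeuristic {
  // Stand-in for inspecting each clause's BulkScorer: `true` means the
  // clause provides a specialized BulkScorer, `false` means it only has
  // the DefaultBulkScorer wrapper.
  public static boolean useBulkScorer(List<Boolean> clauseHasSpecializedBulk) {
    for (boolean specialized : clauseHasSpecializedBulk) {
      if (specialized) {
        return true; // bulk windowing likely pays off for this clause
      }
    }
    // All clauses are DefaultBulkScorer: fall back to the scorer-based
    // path, which keeps minCompetitiveScore/block-max skipping effective.
    return false;
  }

  public static void main(String[] args) {
    System.out.println(useBulkScorer(List.of(false, false))); // prints false
    System.out.println(useBulkScorer(List.of(false, true)));  // prints true
  }
}
```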

@shimpeko shimpeko force-pushed the dismax-bulk-heuristic branch from 56cf380 to ecc05df Compare February 2, 2026 22:43
@github-actions github-actions Bot modified the milestones: 11.0.0, 10.4.0 Feb 2, 2026
@shimpeko
Author

shimpeko commented Feb 3, 2026

I realized that we cannot call ScorerSupplier.get() more than once. I need another solution.

@shimpeko shimpeko marked this pull request as draft February 3, 2026 00:54
@shimpeko shimpeko force-pushed the dismax-bulk-heuristic branch 2 times, most recently from a3b772a to 160a235 Compare February 3, 2026 01:48
@uschindler uschindler requested a review from jpountz February 3, 2026 09:46
@shimpeko shimpeko force-pushed the dismax-bulk-heuristic branch from 160a235 to 6231d37 Compare February 3, 2026 10:07
@shimpeko
Author

shimpeko commented Feb 3, 2026

> I realized that we cannot call ScorerSupplier.get() more than once. I need another solution.

I updated the code so that ScorerSupplier.get() is not called more than once.
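The single-call constraint can be illustrated with a minimal stand-in: an object whose underlying `get()` must be invoked at most once, so the caller caches the first result instead of calling again. This is not Lucene's ScorerSupplier, just a hypothetical sketch of the pattern.

```java
// Hypothetical illustration: a supplier whose delegate may be consumed
// only once, so the result is cached and reused, mirroring how the PR
// was reworked to avoid a second ScorerSupplier.get() call.
import java.util.function.Supplier;

public class OnceSupplier<T> {
  private final Supplier<T> delegate;
  private T cached;
  private boolean called;

  public OnceSupplier(Supplier<T> delegate) {
    this.delegate = delegate;
  }

  // Invokes the underlying supplier at most once; later calls return
  // the cached result.
  public T get() {
    if (!called) {
      cached = delegate.get();
      called = true;
    }
    return cached;
  }
}
```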

@shimpeko shimpeko marked this pull request as ready for review February 3, 2026 10:08
@uschindler
Contributor

uschindler commented Feb 3, 2026

Hi, it would be nice not to force-push every time; it makes reviewing hard. I have no idea whether you have fixed the linter warnings. Squashing the commits is done on our side while merging, so please don't do it yourself.

Otherwise the change looks fine to me, but I leave it to @jpountz as he understands the code much better.

@uschindler
Contributor

uschindler commented Feb 3, 2026

Please fix the linter warning by running "./gradlew tidy" and commit, but don't squash!

> Task :lucene:core:checkGoogleJavaFormat FAILED
java file(s) have google-java-format violations (run './gradlew tidy' to fix). An overview diff of changes:
== /home/runner/work/lucene/lucene/lucene/core/src/java/org/apache/lucene/search/DisjunctionMaxQuery.java
@@ -181,8 +181,7 @@
             }

             return new Weight.DefaultBulkScorer(
-                new DisjunctionMaxScorer(tieBreakerMultiplier, scorers, scoreMode, Long.MAX_VALUE)
-            );
+                new DisjunctionMaxScorer(tieBreakerMultiplier, scorers, scoreMode, Long.MAX_VALUE));
           }

           @Override

@shimpeko
Author

shimpeko commented Feb 3, 2026

Thank you for the review, and sorry about the repeated force-pushes — I was trying to clean up the commit history and should have avoided doing that during review. I won’t force-push going forward. Ran tidy and pushed 68ada56. Will wait for @jpountz's review.

@jpountz
Contributor

jpountz commented Feb 3, 2026

> When all clauses return DefaultBulkScorer, the bulk windowing and replay logic adds overhead while preventing effective use of minCompetitiveScore and block-max optimizations

It's true that the windowing+replay logic isn't free, but I remember that it was still better than using a Scorer, which had to keep reordering a heap on every document. As far as block-max optimizations are concerned, DisjunctionMaxBulkScorer tracks the min competitive score and passes it to its sub clauses whenever scoring a window (`scorer.setMinCompetitiveScore(topLevelScorable.minCompetitiveScore);`), so this should work fine.

Let's benchmark this change with luceneutil and our dismax tasks (https://github.com/mikemccand/luceneutil/blob/main/tasks/wikinightly.tasks#L257-L276). DisMaxTerm in particular should be a good task since TermQuery doesn't have a bulk scorer. I haven't played with this for a while; it's possible that bulk scoring isn't helping anymore.

@shimpeko
Author

shimpeko commented Feb 4, 2026

I tried to run wikinightly but it didn't finish after 10 hours on my laptop, so I just picked some dismax tasks. Looking at the results, this change (edit: this PR) basically has no effect on the performance of those dismax tasks, including DismaxTerm. It also means that bulk scoring is not improving the performance of DismaxTerm.

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                 DismaxOrHighMed      218.28     (15.2%)      216.35     (14.6%)   -0.9% ( -26% -   34%) 0.767
                      DismaxTerm      633.42     (16.9%)      635.44     (16.2%)    0.3% ( -28% -   40%) 0.923
                        PKLookup      230.86     (14.8%)      232.32     (14.0%)    0.6% ( -24% -   34%) 0.826
                DismaxOrHighHigh      202.17     (15.9%)      205.03     (15.3%)    1.4% ( -25% -   38%) 0.650
         FilteredDismaxOrHighMed      140.01     (11.7%)      142.35     (10.1%)    1.7% ( -18% -   26%) 0.445
        FilteredDismaxOrHighHigh       59.48     (12.1%)       61.13      (8.9%)    2.8% ( -16% -   27%) 0.191
              FilteredDismaxTerm      127.87     (13.5%)      133.05      (8.8%)    4.1% ( -16% -   30%) 0.075

I'll dig around deeper.

Commands to run bench and task detail
util % grep -A 5 'sourceData =' src/python/localrun.py 
  sourceData = competition.Data(
    "wikimediumall",
    constants.WIKI_MEDIUM_DOCS_LINE_FILE,
    constants.WIKI_MEDIUM_DOCS_COUNT,
    constants.DISMAX_TERM_TASKS_FILE,
  )
util % cat src/python/localconstants.py 
BASE_DIR = '/Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home'
BENCH_BASE_DIR = '/Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home/util'
DISMAX_TERM_TASKS_FILE = '%s/tasks/dismax_term_only.tasks' % BENCH_BASE_DIR
util % cd ../lucene_baseline && git show -s --oneline HEAD && cd ../util/
7ebdb9316e5 (HEAD -> main, origin/main, origin/HEAD) Add next minor version 10.5.0
util % cd ../lucene_candidate && git show -s --oneline HEAD && cd ../util 
68ada56464 (HEAD -> dismax-bulk-heuristic, origin/dismax-bulk-heuristic) ./gradlew tidy --rerun-tasks
util % python src/python/localrun.py --iterations=50 --warmups=50 -b /Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home/lucene_baseline -c /Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home/lucene_candidate > result.txt
util % cat tasks/dismax_term_only.tasks 
DismaxTerm: 0 +dismaxFields=titleTokenized,body
DismaxTerm: names +dismaxFields=titleTokenized,body
DismaxTerm: nbsp +dismaxFields=titleTokenized,body
DismaxTerm: part +dismaxFields=titleTokenized,body
DismaxTerm: st +dismaxFields=titleTokenized,body

DismaxOrHighHigh: are last +dismaxFields=titleTokenized,body
DismaxOrHighHigh: at united +dismaxFields=titleTokenized,body
DismaxOrHighHigh: but year +dismaxFields=titleTokenized,body
DismaxOrHighHigh: name its +dismaxFields=titleTokenized,body
DismaxOrHighHigh: to but +dismaxFields=titleTokenized,body

DismaxOrHighMed: at mostly +dismaxFields=titleTokenized,body
DismaxOrHighMed: his interview +dismaxFields=titleTokenized,body
DismaxOrHighMed: http 9 +dismaxFields=titleTokenized,body
DismaxOrHighMed: they hard +dismaxFields=titleTokenized,body
DismaxOrHighMed: title bay +dismaxFields=titleTokenized,body

FilteredDismaxTerm: 0 +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxTerm: names +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxTerm: nbsp +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxTerm: part +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxTerm: st +dismaxFields=titleTokenized,body +filter=5%

FilteredDismaxOrHighHigh: are last +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxOrHighHigh: at united +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxOrHighHigh: but year +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxOrHighHigh: name its +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxOrHighHigh: to but +dismaxFields=titleTokenized,body +filter=5%

FilteredDismaxOrHighMed: at mostly +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxOrHighMed: his interview +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxOrHighMed: http 9 +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxOrHighMed: they hard +dismaxFields=titleTokenized,body +filter=5%
FilteredDismaxOrHighMed: title bay +dismaxFields=titleTokenized,body +filter=5%
util %

@shimpeko
Author

shimpeko commented Feb 4, 2026

Continued from #15659 (comment).

The DismaxTerm task uses BatchScoreBulkScorer, so the baseline and candidate were using the same path in the benchmark in #15659 (comment).

@shimpeko
Author

shimpeko commented Feb 5, 2026

I added test cases that are closer to my query, which combines dismax + constant_score: https://github.com/shimpeko/luceneutil/pull/1/changes. Looking at the following benchmark results, I think I can say that the changes in this PR have a significant positive impact on the performance of a specific type of query.

> As far as block-max optimizations are concerned, DisjunctionMaxBulkScorer tracks the min competitive score and passes it to its sub clauses whenever scoring a window

This seems true, so I don't clearly understand yet why using DisjunctionMaxBulkScorer causes a regression in this particular case. @jpountz, do you have any idea?

Edit: I'll look deeper into DisjunctionMaxBulkScorer, but I'd be happy to get feedback/suggestions for this PR now that I have the benchmark results.

Result (50 warmups, 50 iter)

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                      DismaxTerm     1034.68     (11.6%)      935.85     (11.7%)   -9.6% ( -29% -   15%) 0.000
                 DismaxOrHighMed      224.64     (16.2%)      206.15     (16.7%)   -8.2% ( -35% -   29%) 0.012
              FilteredDismaxTerm      201.44      (9.5%)      189.45     (12.1%)   -5.9% ( -25% -   17%) 0.006
        FilteredDismaxOrHighHigh       61.88     (14.1%)       59.53     (11.3%)   -3.8% ( -25% -   25%) 0.137
                DismaxOrHighHigh       99.12     (13.0%)       95.37     (12.9%)   -3.8% ( -26% -   25%) 0.143
                        PKLookup      240.78     (14.9%)      232.55     (15.0%)   -3.4% ( -29% -   31%) 0.254
         FilteredDismaxOrHighMed      170.69     (14.1%)      166.51     (13.5%)   -2.5% ( -26% -   29%) 0.374
                   DisMaxCsTerm1     1814.59     (12.8%)     2010.12     (16.9%)   10.8% ( -16% -   46%) 0.000
                  DisMaxCSTerm20      169.78     (12.3%)      307.85     (43.2%)   81.3% (  22% -  156%) 0.000

Commands to run bench
util % cd ../lucene_baseline && git show -s --oneline HEAD && cd ../util
7ebdb9316e5 (HEAD -> main, origin/main, origin/HEAD) Add next minor version 10.5.0
util % cd ../lucene_candidate && git show -s --oneline HEAD && cd ../util
68ada56464 (HEAD -> dismax-bulk-heuristic, origin/dismax-bulk-heuristic) ./gradlew tidy --rerun-tasks
util % grep -A 5 'sourceData =' src/python/localrun.py                   
  sourceData = competition.Data(
    "wikimediumall",
    constants.WIKI_MEDIUM_DOCS_LINE_FILE,
    constants.WIKI_MEDIUM_DOCS_COUNT,
    constants.DISMAX_TASKS_FILE,
  )
util % cat src/python/localconstants.py 
import os

BASE_DIR = '/Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home'
BENCH_BASE_DIR = '/Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home/util'
DISMAX_TASKS_FILE = '%s/tasks/dismax_constantscore.tasks' % BENCH_BASE_DIR

#JAVA_HOME = os.environ.get("JAVA_HOME")
#java_bin = JAVA_HOME + "/bin/" if JAVA_HOME else ""
#if java_bin:
#  print("Using java from: %s" % java_bin)
#if "JAVA_EXE" not in globals():
#  JAVA_EXE = f"{java_bin}java"
#if "JAVAC_EXE" not in globals():
#  JAVAC_EXE = f"{java_bin}javac"
#if "JAVA_COMMAND" not in globals():
#  JAVA_COMMAND = "%s -server -Xms2g -Xmx2g --add-modules jdk.incubator.vector -XX:+HeapDumpOnOutOfMemoryError -XX:+UseParallelGC  -Dlucene.dismax.debug=true" % JAVA_EXE
#else:
#  print("use java command %s" % JAVA_COMMAND)  # pyright: ignore[reportUndefinedVariable] # TODO: fix how variables are managed here
util % python src/python/localrun.py --iteration=50 --warmups=50 -b /Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home/lucene_baseline -c /Users/shimpei-kodama/github.com/mikemccand/luceneutil/bench_home/lucene_candidate > result.txt

@benwtrent benwtrent modified the milestones: 10.4.0, 10.5.0 Feb 6, 2026
The previous DisjunctionMaxBulkScorer uses a fixed 4096-doc window.
minCompetitiveScore is only propagated to sub-scorers at window
boundaries, so up to 4096 docs may be scored before a new threshold
reaches them. For specific types of queries that can eliminate
sub-scorers after fewer than 4096 evaluations, DisjunctionMaxBulkScorer
is slower than the non-bulk DisjunctionMaxScorer.

This change updates DisjunctionMaxBulkScorer to start with a window size
of 1 and double it on each iteration, capping at 4096.

Small initial windows let minCompetitiveScore propagate frequently while
the threshold is rising and sub-scorers are being eliminated.

This is expected to help most when TOP_SCORES has a low N, the query has
a high match rate, and the disjunction has many clauses. The initial
sub-4096 windows add minor per-window overhead for queries where
full-size windows would have been fine.
@github-actions github-actions Bot removed this from the 10.5.0 milestone Feb 14, 2026
@github-actions github-actions Bot added this to the 10.4.0 milestone Feb 14, 2026
@shimpeko
Author

shimpeko commented Feb 14, 2026

@jpountz I’ve dropped the idea of switching from DisjunctionMaxBulkScorer to DisjunctionMaxScorer when all child bulk scorers are DefaultBulkScorer. That approach has been reverted.

Instead, this change (fe86ceb) now makes DisjunctionMaxBulkScorer start with a window size of 1 and grow it exponentially.

As you noted in #15659 (comment), “DisjunctionMaxBulkScorer tracks the min competitive score and passes it to its sub clauses”. However, propagating the min competitive score only at fixed 4096-doc boundaries is sometimes too slow for queries such as "constant_score + top-N (with small N) + high match rate". In these cases, the delayed propagation causes a regression compared to DisjunctionMaxScorer, which propagates the min score on a per-doc basis.

The new approach addresses that problem by exponentially increasing the window size from 1 up to 4096, ensuring that min competitive score propagation happens early enough, while still preserving the throughput advantages of bulk scoring. I believe this approach is more robust than relying on scorer-type checks such as if (bs instanceof Weight.DefaultBulkScorer dbs).
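The exponential window growth can be sketched as follows. The class and method names here are hypothetical, not Lucene's internals; the sketch only shows how the window sizes evolve.

```java
// Illustrative sketch of growing the scoring window exponentially from 1
// up to a 4096-doc cap. minCompetitiveScore would be re-propagated to the
// sub-scorers at every window boundary, so propagation happens after
// 1, 3, 7, ... docs instead of only after 4096. Hypothetical names.
import java.util.ArrayList;
import java.util.List;

public class WindowGrowth {
  static final int MAX_WINDOW = 4096;

  // Returns the sequence of window sizes used to cover `numDocs` docs.
  public static List<Integer> windowSizes(int numDocs) {
    List<Integer> sizes = new ArrayList<>();
    int window = 1;
    int scored = 0;
    while (scored < numDocs) {
      int size = Math.min(window, numDocs - scored);
      sizes.add(size);
      scored += size;
      window = Math.min(window * 2, MAX_WINDOW); // double, capped at 4096
    }
    return sizes;
  }

  public static void main(String[] args) {
    System.out.println(windowSizes(20)); // prints [1, 2, 4, 8, 5]
  }
}
```

With this growth schedule, the first few windows are tiny, so a rising threshold reaches the sub-scorers early, while large windows later on preserve the throughput advantage of bulk scoring.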

Below are the benchmark results, which demonstrate the improvement:

                            TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                        PKLookup      240.92     (16.3%)      242.22     (15.1%)    0.5% ( -26% -   38%) 0.732
                DismaxOrHighHigh      264.57     (15.7%)      270.59     (15.1%)    2.3% ( -24% -   39%) 0.141
                 DismaxOrHighMed      214.97     (16.1%)      219.96     (15.0%)    2.3% ( -24% -   39%) 0.135
              FilteredDismaxTerm      130.87     (13.0%)      134.62     (11.0%)    2.9% ( -18% -   30%) 0.018
        FilteredDismaxOrHighHigh       58.86     (16.1%)       60.69     (11.9%)    3.1% ( -21% -   37%) 0.028
         FilteredDismaxOrHighMed      141.60     (14.2%)      147.28     (12.0%)    4.0% ( -19% -   35%) 0.002
                      DismaxTerm      794.99     (16.2%)      831.28     (14.1%)    4.6% ( -22% -   41%) 0.003
                        CSTerm20      560.08     (26.9%)      596.16     (22.1%)    6.4% ( -33% -   75%) 0.009
                   DisMaxCsTerm1     1400.32     (14.2%)     1974.02     (14.6%)   41.0% (  10% -   81%) 0.000
                  DisMaxCSTerm20      205.76     (18.9%)      322.79     (29.3%)   56.9% (   7% -  129%) 0.000

baseline: 16f83c9
my_modified_version: fe86ceb
--iterations=200 --warmups=50

As mentioned in #15659 (comment), I added test cases that exercise a dismax + constant_score structure in luceneutil: https://github.com/shimpeko/luceneutil/pull/1/changes.

@benwtrent benwtrent modified the milestones: 10.4.0, 10.5.0 Feb 26, 2026
@github-actions
Contributor

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions Bot added the Stale label Mar 13, 2026
@benwtrent
Member

I really like the simplicity of the change, and it gives obvious improvements with very little detriment.

@shimpeko do you mind moving your "CHANGES" entry to 10.5? Once you do, I think we can move towards merging this unless @uschindler or @jpountz objects.

@github-actions github-actions Bot removed the Stale label Mar 20, 2026
@github-actions
Contributor

github-actions Bot commented Apr 3, 2026

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

@github-actions github-actions Bot added the Stale label Apr 3, 2026

Development

Successfully merging this pull request may close these issues.

Performance regression starting in Lucene 10.1.0 (dis_max + constant_score)

4 participants