fix: correct inverted condition in DocLevelMonitorQueries causing query index churn#2154

Open
thecodingshrimp wants to merge 1 commit into opensearch-project:main from thecodingshrimp:fix/doc-level-query-index-inverted-condition

@thecodingshrimp

Summary

Fixes #2153

The condition in DocLevelMonitorQueries.updateQueryIndexMappings that decides when to re-fetch the write index from the query index alias was inverted (!= instead of ==). This caused the re-fetch path to fire on every monitor execution rather than only in the backwards-compatibility case it was designed to handle.

Root Cause

monitorMetadata.sourceToQueryIndexMapping stores the concrete backing index name (e.g. .opensearch-sap-pre-packaged-rules-queries-000001) after the first successful lookup. The condition was intended to re-resolve the alias only when the stored value is the alias name itself — a legacy situation where the metadata was written before this logic existed.

With !=, the condition evaluates to true on every subsequent run of a monitor that has deleteQueryIndexInEveryRun set, because the stored concrete index name always differs from the alias name. Each such run calls getWriteIndexNameForAlias and rewrites the metadata, which cascades into a delete and recreate of the backing query index and generates 6–10 MergeSchedulerConfig log lines per node per cycle.

With ==, the condition correctly fires only when the alias name itself is stored, which is the backwards-compat path it was designed for.
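
The difference between the two operators can be illustrated with a small standalone model. This is only a sketch, not the plugin code: the helper names (shouldReResolveBuggy, shouldReResolveFixed, countReResolutions) are hypothetical, the alias name is assumed to be the backing index name without its -000001 suffix, and assigning WRITE_INDEX stands in for the real getWriteIndexNameForAlias call and metadata rewrite.

```kotlin
// Hypothetical model of the check; names and the alias value are assumptions.
const val QUERY_INDEX_ALIAS = ".opensearch-sap-pre-packaged-rules-queries"
const val WRITE_INDEX = ".opensearch-sap-pre-packaged-rules-queries-000001"

// Inverted check: true whenever a concrete index name is already stored.
fun shouldReResolveBuggy(stored: String, alias: String, deleteEveryRun: Boolean?): Boolean =
    stored != alias && deleteEveryRun == true

// Corrected check: true only when the alias name itself is still stored.
fun shouldReResolveFixed(stored: String, alias: String, deleteEveryRun: Boolean?): Boolean =
    stored == alias && deleteEveryRun == true

// Simulates `runs` monitor executions with the flag set and counts how often
// the re-resolve path is taken.
fun countReResolutions(
    initialStored: String,
    runs: Int,
    shouldReResolve: (String, String, Boolean?) -> Boolean
): Int {
    var stored = initialStored
    var count = 0
    repeat(runs) {
        if (shouldReResolve(stored, QUERY_INDEX_ALIAS, true)) {
            // Stand-in for resolving the alias and rewriting the metadata.
            stored = WRITE_INDEX
            count++
        }
    }
    return count
}

fun main() {
    println("buggy, concrete name stored: " + countReResolutions(WRITE_INDEX, 5, ::shouldReResolveBuggy))
    println("fixed, concrete name stored: " + countReResolutions(WRITE_INDEX, 5, ::shouldReResolveFixed))
    println("fixed, legacy alias stored:  " + countReResolutions(QUERY_INDEX_ALIAS, 5, ::shouldReResolveFixed))
}
```

Under these assumptions, five simulated runs take the re-resolve path 5 times with the inverted check, 0 times with the corrected check when a concrete name is already stored, and exactly once with the corrected check when the legacy alias name is stored.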

Fix

// Before (broken)
targetQueryIndex != monitor.dataSources.queryIndex && monitor.deleteQueryIndexInEveryRun == true

// After (correct)
targetQueryIndex == monitor.dataSources.queryIndex && monitor.deleteQueryIndexInEveryRun == true

The comment block was also updated to document the correct invariant.

Impact

At scale (3 master nodes, many detectors), the broken condition produced 23,000+ MergeSchedulerConfig log lines per minute. This fix eliminates that churn.

Related

This bug is amplified by a separate issue in opensearch-project/security-analytics where chained_findings monitors are created with deleteQueryIndexInEveryRun=true, activating the broken code path on every detector run. Both fixes are needed to fully eliminate the log storm. The security-analytics fix prevents the flag from being set unnecessarily; this fix corrects the inverted logic that acts on it.

…ry index churn (opensearch-project#2153)

Signed-off-by: thecodingshrimp <leonard.stutzer@sap.com>
@thecodingshrimp force-pushed the fix/doc-level-query-index-inverted-condition branch from 55734fa to 289f661 on May 21, 2026 15:44

Development

Successfully merging this pull request may close these issues.

[BUG] DocLevelMonitorQueries: inverted condition causes query index to be deleted and recreated on every monitor execution
