Skip to content

Fix Default Recreate handler#25746

Open
mohityadav766 wants to merge 1 commit intomainfrom
fix-default-recreat
Open

Fix Default Recreate handler#25746
mohityadav766 wants to merge 1 commit intomainfrom
fix-default-recreat

Conversation

@mohityadav766
Copy link
Member

@mohityadav766 mohityadav766 commented Feb 8, 2026

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Fixed alias name conflict during search index recreation: Added cleanup logic to delete canonical indices before alias swap operations in DefaultRecreateHandler
  • Prevents Elasticsearch/OpenSearch errors: Canonical index names (e.g., table_search_index) must be removed as physical indices before being attached as aliases to timestamped indices
  • Applied to both code paths: Updated finalizeReindex() and promoteEntityIndex() methods to ensure consistent behavior during batch and single-entity index recreation
  • Maintains zero-downtime pattern: Deletion occurs before atomic alias swap, ensuring queries continue to work throughout the recreation process

@gitar-bot
Copy link

gitar-bot bot commented Feb 8, 2026

🔍 CI failure analysis for 016ac85: Multiple CI failures across 11 jobs: 2 infrastructure, 3 Playwright with flaky tests, 2 Python with timing issues, 4 Maven/Test Report jobs with 3 different test failures - all unrelated to backend search index PR changes.

Issue

Multiple CI jobs failed with different types of failures:

  1. Integration test jobs - Infrastructure errors
  2. Playwright test jobs - Multiple shards with flaky tests
  3. Python test jobs - Both Python 3.10 and 3.11
  4. Maven PostgreSQL CI + Test Report - 2 test failures in FeedResourceTest
  5. Maven SonarCloud CI + Test Report - 1 test failure in GlossaryTermResourceTest

Root Cause

Integration Tests

GitHub Actions runner disk space exhaustion: CI runners ran out of disk space.

Playwright Tests - 3 Shards

  • Total: 3 failed, 19 flaky, 1143 passed (97.8% pass rate)
  • Failures in custom properties, domain operations, metrics, service forms, data asset rules

Python Tests - Identical Across Both Versions

  • Python 3.11: 7 errors, 530 passed (98.7%)
  • Python 3.10: 7 errors, 530 passed (98.7%)
  • Error: "Could not fetch database entity from Search Indexes"
  • Identical failures confirm test environment timing issue, not code defect

Maven PostgreSQL CI + Test Report

2 test failures in FeedResourceTest (7919 tests run, 99.97% pass rate):

  1. list_threadsWithMentionsFilter:1326 - expected: <32> but was: <3>
  2. post_validThreadAndList_200:326 - expected: <6> but was: <8>

Analysis: Feed/Thread resource tests with count mismatches - tests Feeds API functionality unrelated to search index management.

Maven SonarCloud CI + Test Report

1 test failure in GlossaryTermResourceTest (7919 tests run, 99.99% pass rate):

  • test_GlossaryTermWorkflow_TermHasReviewers_UpdatedByNotReviewer_WorkflowTriggered:2290
  • Error: "Term should be approved after task resolution ==> expected: but was: "

Analysis: Glossary Term workflow test verifying state transitions during approval. Failure suggests timing issues with workflow task resolution or async state updates. Glossary term workflow functionality is completely unrelated to search index alias management.

Details

Relationship to PR Changes: This PR modifies DefaultRecreateHandler.java to fix Elasticsearch/OpenSearch alias conflicts during index recreation. All test failures are in functionality completely separate from search index alias management:

  • Feed/Thread resources (activity feeds)
  • Glossary Term workflow (term approval processes)
  • Frontend UI components
  • Python metadata ingestion

Overall CI Status

  • Integration: 2/2 failed (infrastructure)
  • Playwright: 3 failed, 19 flaky, 1143 passed (97.8%)
  • Python (3.10 & 3.11): 7 errors each, 530 passed (98.7%)
  • Maven PostgreSQL: 2 failures, 7919 passed (99.97%)
  • Maven SonarCloud: 1 failure, 7919 passed (99.99%)
  • PR change: 18 lines in single Java file
Code Review 👍 Approved with suggestions 0 resolved / 1 findings

Correct fix for the alias name conflict bug during index recreation. One minor suggestion about the brief downtime window and error handling around the new deletion step.

💡 Edge Case: Brief search downtime window during canonical index deletion

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/DefaultRecreateHandler.java:91

Between deleteIndexWithBackoff(canonicalIndex) and the completion of swapAliases(...), queries targeting the canonical index name will fail because the physical index is deleted but the alias hasn't been created yet. The code comments claim "zero-downtime" but this introduces a brief unavailability window.

This is likely an acceptable tradeoff (the alternative is a complete alias swap failure), but the code comment above ("zero-downtime") should be updated to reflect the reality, e.g.:

// Canonical index must be removed before it can be used as an alias name.
// Note: This creates a brief window where the canonical name is unavailable.

Also, consider wrapping the deletion in its own try-catch so that a failure here still allows falling back to the existing behavior (even if the alias swap will likely fail too):

if (oldIndicesToDelete.contains(canonicalIndex)) {
  if (searchClient.indexExists(canonicalIndex)) {
    try {
      searchClient.deleteIndexWithBackoff(canonicalIndex);
      oldIndicesToDelete.remove(canonicalIndex);
      LOG.info("Cleaned up old index '{}' for entity '{}'.", canonicalIndex, entityType);
    } catch (Exception ex) {
      LOG.warn("Failed to clean up canonical index '{}' before alias swap for entity '{}'.", canonicalIndex, entityType, ex);
    }
  }
}

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant