fix: prevent duplicate lineage cleanup and remove blocking refresh in PATCH /api/v1/tables/{id} #26702

Open
kojikokojiko wants to merge 1 commit into open-metadata:main from kojikokojiko:fix/patch-table-slow-lineage-cleanup

@kojikokojiko
Contributor

Fixes #26674

Root Cause

Two issues caused PATCH /api/v1/tables/{id} to be extremely slow (49–99s) when a database service had many lineage entries:

Issue 1: Duplicate deferred operations during session consolidation

When the same user edits the same entity within 10 minutes, session change consolidation is triggered. In this path, updateInternal() is called multiple times:

  • once in incrementalChange() for incremental diff calculation
  • once or twice in revert() for consolidation
  • once in the final flush phase

Each call traversed entitySpecificUpdate() → updateColumns() → handleColumnLineageUpdates() → deferReactOperation(), registering deleteColumnsInUpstreamLineage as a deferred operation 2–4 times per PATCH request.

Fix: Add a deferOpsEnabled flag to EntityUpdater that is set to false during intermediate calculations (incrementalChange and revert). Deferred operations are only registered during the final updateInternal call.
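
The gating idea can be sketched in isolation as below. This is a minimal illustration, not the actual EntityUpdater code: the class `DeferredOpsSketch` and the methods `deferOperation`, `runIntermediatePass`, and `patchWithConsolidation` are hypothetical stand-ins, and only the `deferOpsEnabled` flag name comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for EntityUpdater; only the gating idea is taken from the PR description.
final class DeferredOpsSketch {
  private final List<Runnable> deferredOps = new ArrayList<>();
  private boolean deferOpsEnabled = true;
  int cleanupRuns = 0; // counts how often the simulated lineage cleanup executes

  // Stand-in for deferReactOperation(): registers work only while the flag is on.
  void deferOperation(Runnable op) {
    if (deferOpsEnabled) {
      deferredOps.add(op);
    }
  }

  // Stand-in for updateInternal(): every pass reaches the column-lineage handling.
  void updateInternal() {
    deferOperation(() -> cleanupRuns++);
  }

  // Intermediate passes (incrementalChange, revert) run with registration switched off,
  // and the previous flag value is restored even if the pass throws.
  void runIntermediatePass() {
    boolean previous = deferOpsEnabled;
    deferOpsEnabled = false;
    try {
      updateInternal();
    } finally {
      deferOpsEnabled = previous;
    }
  }

  // Simulates one consolidated PATCH: diff pass, revert pass, then the final pass.
  int patchWithConsolidation() {
    runIntermediatePass(); // incrementalChange()
    runIntermediatePass(); // revert()
    updateInternal();      // final flush phase
    deferredOps.forEach(Runnable::run);
    deferredOps.clear();
    return cleanupRuns;
  }
}
```

In this sketch, `new DeferredOpsSketch().patchWithConsolidation()` runs the simulated cleanup once, because only the final pass registers a deferred operation. Without the flag, the same three passes would register it three times.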

Issue 2: Blocking refresh in updateByQuery

deleteColumnsInUpstreamLineage and updateColumnsInUpstreamLineage used Refresh.True in updateByQuery, which blocks until all shards have refreshed after updating potentially thousands of documents. With 13,275 matching documents, this caused ~90s of blocking per call.

Fix: Remove the refresh parameter, allowing ES/OpenSearch to refresh on its normal schedule (~1s), which is sufficient for lineage cleanup.
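
At the REST level, the difference amounts to whether the `refresh=true` query parameter is sent with `_update_by_query`. The sketch below only builds the request path to make that visible. The helper `UpdateByQueryPathSketch.path` and the index name in the usage note are hypothetical; the PR itself changes a Java client call, not a hand-built path.

```java
// Hypothetical helper for illustration; it does not reflect OpenMetadata's search client code.
final class UpdateByQueryPathSketch {
  // Builds the REST path for _update_by_query.
  // refresh=true is appended only when a blocking refresh is requested.
  static String path(String index, boolean blockingRefresh) {
    String base = "/" + index + "/_update_by_query";
    return blockingRefresh ? base + "?refresh=true" : base;
  }
}
```

With a placeholder index name, `path("lineage_index", true)` yields `/lineage_index/_update_by_query?refresh=true` (the behavior before the fix), while `path("lineage_index", false)` yields `/lineage_index/_update_by_query`, leaving visibility to the index's regular refresh interval.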

Verification

Confirmed with local testing:

Duplicate call fix: Server logs showed deleteColumnsInUpstreamLineage executing 3 times per PATCH with consolidation before the fix, and 1 time after.

Refresh fix: Direct ES measurement with 1,000 documents:

| | Time |
|---|---|
| refresh=true (before) | 0.245s |
| no refresh (after) | 0.124s |
| Improvement | ~50% reduction |

Combined effect (3x duplicate calls × blocking refresh): expected >80% reduction in PATCH latency for tables with many lineage entries.
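
The >80% figure can be reproduced from the numbers above with a short calculation. This is a back-of-the-envelope sketch: it assumes the 1,000-document timings apply per call and that calls run sequentially, which the measurements here do not establish for larger document counts. The class and method names are hypothetical.

```java
// Hypothetical helper for the arithmetic only; not part of the PR.
final class LatencyEstimateSketch {
  // Fractional reduction in total time when going from `callsBefore` calls
  // at `secondsBefore` each to `callsAfter` calls at `secondsAfter` each.
  static double reduction(int callsBefore, double secondsBefore, int callsAfter, double secondsAfter) {
    double before = callsBefore * secondsBefore;
    double after = callsAfter * secondsAfter;
    return 1.0 - after / before;
  }
}
```

With the measured values, `reduction(3, 0.245, 1, 0.124)` compares 3 × 0.245s = 0.735s against 1 × 0.124s and comes out at roughly 0.83, which is consistent with the expected >80% reduction.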

Test plan

  • Run ingestion workflow on a database service with 700+ lineage entries and verify PATCH /api/v1/tables/{id} completes in reasonable time
  • Verify server logs show deleteColumnsInUpstreamLineage appears only once per PATCH request even with session consolidation active
  • Verify lineage data is eventually consistent in search after column deletion (within ~1s)
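
For the eventual-consistency check in the last item, a test would need to poll rather than assert immediately, since the index is no longer refreshed synchronously. A minimal polling helper might look like the sketch below. The name `EventuallySketch.eventually` and its parameters are hypothetical, and the condition passed in (for example, a search that confirms the deleted column no longer appears in lineage) is left to the caller.

```java
import java.util.function.BooleanSupplier;

// Hypothetical test helper; not part of the PR or of OpenMetadata's test utilities.
final class EventuallySketch {
  // Polls `condition` every `intervalMillis` until it holds or `timeoutMillis` elapses.
  // Returns true if the condition held in time, false on timeout or interruption.
  static boolean eventually(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() >= deadline) {
        return false;
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        return false;
      }
    }
  }
}
```

A timeout somewhat above the ~1s refresh interval (for instance 5,000 ms with a 100 ms polling interval) would leave headroom for a slow refresh without making the test flaky.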

🤖 Generated with Claude Code

@github-actions
Contributor

Hi there 👋 Thanks for your contribution!

The OpenMetadata team will review the PR shortly! Once it has been labeled as safe to test, the CI workflows
will start executing and we'll be able to make sure everything is working as expected.

Let us know if you need any help!

@kojikokojiko force-pushed the fix/patch-table-slow-lineage-cleanup branch from 1178d93 to ef589a8 on March 23, 2026 18:26

@kojikokojiko force-pushed the fix/patch-table-slow-lineage-cleanup branch from ef589a8 to ed4f0ea on March 24, 2026 06:11

…on and remove blocking refresh in lineage cleanup

Fixes open-metadata#26674

Two issues caused PATCH /api/v1/tables/{id} to be extremely slow (49-99s)
when a database service had many lineage entries:

1. During session change consolidation (when the same user edits the same
   entity within 10 minutes), updateInternal() was called multiple times:
   - once in incrementalChange() for incremental diff calculation
   - once or twice in revert() for consolidation
   - once in the final flush phase
   Each call registered deleteColumnsInUpstreamLineage as a deferred
   operation, causing it to execute 2-4 times per PATCH request.
   Fix: add deferOpsEnabled flag to EntityUpdater that is set to false
   during intermediate calculations (incrementalChange and revert), so
   deferred operations are only registered during the final updateInternal.

2. updateByQuery in deleteColumnsInUpstreamLineage and
   updateColumnsInUpstreamLineage used Refresh.True, which blocks until
   all shards have refreshed after updating potentially thousands of
   documents. This caused ~90s of blocking per call.
   Fix: remove the refresh parameter, allowing ES/OpenSearch to refresh
   on its normal schedule (within ~1s), which is sufficient for lineage
   cleanup.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kojikokojiko force-pushed the fix/patch-table-slow-lineage-cleanup branch from ed4f0ea to fd125db on March 24, 2026 06:22

@gitar-bot
gitar-bot bot commented Mar 24, 2026

Code Review ✅ Approved

Fixes duplicate lineage cleanup and removes blocking refresh in PATCH /api/v1/tables/{id} to improve performance. No issues found.


Development

Successfully merging this pull request may close these issues.

[BUG][Performance Issue] PATCH /api/v1/tables/{id} extremely slow (49–99s) during Database Service ingestion workflow