Fix race condition in async event consumers on rapid create-delete #26718

Merged
TeddyCr merged 2 commits into main from
fix/graceful-handling-entity-deleted-race-condition
Mar 24, 2026

@manerow (Contributor) commented on Mar 24, 2026

Fixes #26690

Summary

  • When an entity is created and immediately hard-deleted, the async event consumers (WorkflowEventConsumer, ActivityFeedPublisher) throw EntityNotFoundException because the entity is already gone by the time they process the create event.
  • This produced ERROR-level logs and permanent failed-event records that were retried indefinitely.
  • Added an EntityNotFoundException catch before the generic Exception catch in sendMessage() of both consumers. The event is now skipped gracefully, logged at DEBUG level, and no failed event is recorded.
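
The race and the catch ordering described in the summary can be sketched in isolation. This is a minimal, self-contained illustration and not the actual OpenMetadata code: the two exception classes are local stand-ins that only share their names with the real ones, and the in-memory `entities` map, the `createEvents` queue, and the `skippedEvents` counter are hypothetical. Only the method name `sendMessage` and the exception names come from the PR description.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class CreateDeleteRaceSketch {
    // Local stand-ins (assumption): the real classes live in OpenMetadata.
    static class EntityNotFoundException extends RuntimeException {
        EntityNotFoundException(String message) { super(message); }
    }

    static class EventPublisherException extends Exception {
        EventPublisherException(String message, Throwable cause) { super(message, cause); }
    }

    // Hypothetical in-memory entity store and queue of pending create events.
    static final Map<String, String> entities = new HashMap<>();
    static final Queue<String> createEvents = new ArrayDeque<>();
    static int skippedEvents = 0;

    // Mimics an entity lookup that fails once the entity has been hard-deleted.
    static String getEntity(String id) {
        String entity = entities.get(id);
        if (entity == null) {
            throw new EntityNotFoundException("Entity " + id + " not found");
        }
        return entity;
    }

    static void sendMessage(String entityId) throws EventPublisherException {
        try {
            getEntity(entityId); // the consumer needs the entity to process the event
        } catch (EntityNotFoundException e) {
            // Specific catch first: the entity was deleted before the event was
            // processed, so skip the event instead of recording a failure.
            skippedEvents++;
        } catch (Exception e) {
            // Any other failure is still reported as a publishing error.
            throw new EventPublisherException("Failed to process event for " + entityId, e);
        }
    }

    public static void main(String[] args) throws EventPublisherException {
        entities.put("table-1", "orders"); // create the entity
        createEvents.add("table-1");       // its create event is queued
        entities.remove("table-1");        // hard delete before the consumer runs

        while (!createEvents.isEmpty()) {
            sendMessage(createEvents.poll());
        }
        System.out.println("skipped=" + skippedEvents);
    }
}
```

Java requires the narrower catch clause to come before the broader one, which is why the new handler has to sit ahead of the generic `Exception` catch for the skip path to be reachable.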

Test plan

  • New unit test testSendMessage_SkipsGracefullyWhenEntityDeleted verifies that an EntityNotFoundException does not surface as an EventPublisherException
  • All 20 existing WorkflowEventConsumerTest tests pass
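
The shape of such a check can be sketched without a test framework. The following is an assumption-laden illustration, not the test added in this PR: `FakeConsumer`, its injected `lookup` function, and the helper `throwsPublisherException` are hypothetical, and the exception classes are local stand-ins. It verifies two things: a missing entity does not produce an EventPublisherException, and an unrelated failure still does.

```java
import java.util.function.Function;

public class SkipOnDeletedEntityCheck {
    // Local stand-ins (assumption): the real classes live in OpenMetadata.
    static class EntityNotFoundException extends RuntimeException {
        EntityNotFoundException(String message) { super(message); }
    }

    static class EventPublisherException extends Exception {
        EventPublisherException(String message, Throwable cause) { super(message, cause); }
    }

    // Hypothetical consumer; the lookup is injected so a check can fake a deleted entity.
    static class FakeConsumer {
        private final Function<String, String> lookup;

        FakeConsumer(Function<String, String> lookup) {
            this.lookup = lookup;
        }

        void sendMessage(String entityId) throws EventPublisherException {
            try {
                lookup.apply(entityId);
            } catch (EntityNotFoundException e) {
                // Entity is gone: skip quietly, nothing is rethrown.
            } catch (Exception e) {
                throw new EventPublisherException("Failed to process " + entityId, e);
            }
        }
    }

    // Returns true when sendMessage surfaces an EventPublisherException.
    static boolean throwsPublisherException(FakeConsumer consumer, String entityId) {
        try {
            consumer.sendMessage(entityId);
            return false;
        } catch (EventPublisherException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        FakeConsumer deleted = new FakeConsumer(id -> {
            throw new EntityNotFoundException(id + " not found");
        });
        FakeConsumer broken = new FakeConsumer(id -> {
            throw new IllegalStateException("unexpected failure");
        });

        if (throwsPublisherException(deleted, "table-1")) {
            throw new AssertionError("deleted entity should be skipped, not reported");
        }
        if (!throwsPublisherException(broken, "table-1")) {
            throw new AssertionError("other failures should still be reported");
        }
        System.out.println("ok");
    }
}
```

Checking the second case alongside the first guards against a fix that accidentally swallows every exception rather than only the not-found case.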

@manerow self-assigned this on Mar 24, 2026
@manerow added the safe to test, To release, and backend labels on Mar 24, 2026
@manerow force-pushed the fix/graceful-handling-entity-deleted-race-condition branch from cb31bf0 to a1b7130 on March 24, 2026 09:43
@github-actions (bot, Contributor) commented on Mar 24, 2026

OpenMetadata Service New-Code Coverage

PASS. Required changed-line coverage: 90.00% overall and per touched production file.

  • Overall executable changed lines: 8/8 covered (100.00%)
  • Missed executable changed lines: 0
  • Non-executable changed lines ignored by JaCoCo: 6
  • Changed production files: 2
| File | Covered | Missed | Executable | Non-exec | Coverage | Uncovered lines |
|---|---|---|---|---|---|---|
| openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/changeEvent/feed/ActivityFeedPublisher.java | 5 | 0 | 5 | 2 | 100.00% | - |
| openmetadata-service/src/main/java/org/openmetadata/service/governance/workflows/WorkflowEventConsumer.java | 3 | 0 | 3 | 4 | 100.00% | - |

Only changed executable lines under openmetadata-service/src/main/java are counted. Test files, comments, imports, and non-executable lines are excluded.

@github-actions (bot, Contributor) commented on Mar 24, 2026

🟡 Playwright Results — all passed (18 flaky)

✅ 3112 passed · ❌ 0 failed · 🟡 18 flaky · ⏭️ 207 skipped

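| Shard | Passed | Failed | Flaky | Skipped |
|---|---|---|---|---|
| 🟡 Shard 1 | 452 | 0 | 3 | 2 |
| 🟡 Shard 2 | 542 | 0 | 4 | 33 |
| 🟡 Shard 3 | 541 | 0 | 5 | 27 |
| 🟡 Shard 4 | 533 | 0 | 5 | 45 |
| ✅ Shard 5 | 511 | 0 | 0 | 67 |
| 🟡 Shard 6 | 533 | 0 | 1 | 33 |
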
🟡 18 flaky test(s) (passed on retry)
  • Features/CustomizeDetailPage.spec.ts › Database Schema - customization should work (shard 1, 1 retry)
  • Flow/Tour.spec.ts › Tour should work from URL directly (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/DataQuality/TestCaseIncidentPermissions.spec.ts › User with TEST_CASE.EDIT_ALL can see edit icon on incidents (shard 2, 1 retry)
  • Features/DataQuality/TestCaseResultPermissions.spec.ts › User with only VIEW cannot PATCH results (shard 2, 1 retry)
  • Features/ImpactAnalysis.spec.ts › Verify column level upstream connections (shard 2, 1 retry)
  • Features/LandingPageWidgets/FollowingWidget.spec.ts › Check followed entity present in following widget (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 3, 1 retry)
  • Pages/DataProductAndSubdomains.spec.ts › Add assets to data product and verify count (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Multiple consecutive domain renames preserve all associations (shard 4, 1 retry)
  • Pages/DomainUIInteractions.spec.ts › Select domain from global dropdown filters explore (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Tag Add, Update and Remove for child entities (shard 4, 1 retry)
  • Pages/EntityDataConsumer.spec.ts › Tier Add, Update and Remove (shard 4, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

@gitar-bot (bot) commented on Mar 24, 2026

Code Review ✅ Approved 1 resolved / 1 findings

Fixes race condition in async event consumers during rapid create-delete cycles by adding EntityNotFoundException safety handling with comprehensive test coverage. No issues found.

✅ 1 resolved
Quality: No test for ActivityFeedPublisher EntityNotFoundException handling

📄 openmetadata-service/src/main/java/org/openmetadata/service/apps/bundles/changeEvent/feed/ActivityFeedPublisher.java:74-79
The PR adds an EntityNotFoundException catch to both WorkflowEventConsumer.sendMessage() and ActivityFeedPublisher.sendMessage(), but only WorkflowEventConsumer has a corresponding test (testSendMessage_SkipsGracefullyWhenEntityDeleted). There is no test class for ActivityFeedPublisher at all, so the new catch block in that class is untested.

Since both consumers apply the identical fix pattern and the WorkflowEventConsumer path is well-tested, this is low risk — but adding a parallel test for ActivityFeedPublisher would ensure both code paths are covered.


@TeddyCr merged commit d9aaeef into main on Mar 24, 2026
47 checks passed
@TeddyCr deleted the fix/graceful-handling-entity-deleted-race-condition branch on March 24, 2026 18:00
@github-actions (bot, Contributor) commented

Changes have been cherry-picked to the 1.12.4 branch.

github-actions bot pushed a commit that referenced this pull request Mar 24, 2026
…26718)

* Fix race condition in async event consumers on rapid create-delete

* Add tests for EntityNotFoundException safety net in event consumers

(cherry picked from commit d9aaeef)

Labels

backend, safe to test, To release

Development

Successfully merging this pull request may close these issues.

Race condition in ActivityFeedPublisher and WorkflowEventConsumer when entity is deleted shortly after creation

2 participants