Skip to content

Fixes #4003: bulk + async restore for large entity hierarchies#27997

Open
harshach wants to merge 2 commits intomainfrom
harshach/review-issue-4004
Open

Fixes #4003: bulk + async restore for large entity hierarchies#27997
harshach wants to merge 2 commits intomainfrom
harshach/review-issue-4004

Conversation

@harshach
Copy link
Copy Markdown
Collaborator

@harshach harshach commented May 8, 2026

Describe your changes:

Fixes #4003

EntityRepository.restoreEntity walked the descendant tree synchronously inside one HTTP request — on a database with 12k tables / 1.1k stored procedures the customer reported 4m37s response times that exceeded typical proxy / ALB idle timeouts. This PR replaces the per-entity recursion with an iterative bulk path and adds a server-side async option for hierarchies large enough that the synchronous response would still hit timeouts.

Type of change:

  • Bug fix
  • Improvement

High-level design:

Bulk synchronous path. restoreChildren now groups CONTAINS children by entity type and dispatches a single bulkRestoreSubtree(ids, updatedBy) per type rather than looping Entity.restoreEntity per descendant. bulkRestoreSubtree reuses the existing deferred-store bulk infrastructure (updateMany, entityExtensionDAO.insertMany, invalidateMany, insertChangeEventsBatch) for one batched DB write per type. Per-descendant ES writes are skipped because restoreFromSearch already cascades the deleted-flag flip across child indexes in one ES update_by_query. Subclass extension hook restoreAdditionalChildren(UUID, String) is provided for repos that link non-CONTAINS related entities.

Async option. New EntityResource.restoreEntityAsync(...) returns 202 Accepted with a RestoreEntityResponse(jobId, message) and runs the restore on AsyncService (virtual-thread executor with semaphore-bounded concurrency); completion / failure is broadcast on the new RESTORE_ENTITY_CHANNEL WebSocket channel. The restoreEntity(uri, sec, id, async) overload is wired into the deep-hierarchy endpoints (Database, DatabaseSchema, Table, StoredProcedure, Dashboard, DashboardDataModel, Container, DatabaseService) via ?async=true.

SDK updates. Java SDK adds an AsyncJobResponse model and EntityServiceBase.restoreServerAsync(id); new fluent builders on Tables and Databases give Tables.find(id).restore().execute() (sync, returns Table) and Tables.find(id).restore().async().execute() (returns AsyncJobResponse) with type-safe terminal operations. Python SDK mirrors this with Tables.restore_async(id) and the fluent Tables.restore_request(id).with_async().execute() plus an AsyncJobResponse dataclass.

Tests:

Use cases covered

  • Synchronously restore a Database with 3 schemas / 12 tables and verify all descendants come back un-deleted.
  • Async-restore the same hierarchy via ?async=true, observe 202 + jobId, then poll until the database flips to deleted=false.
  • restoreChildren correctly groups mixed-type children (DatabaseSchema + StoredProcedure) and dispatches one bulkRestoreSubtree per type with no per-entity Entity.restoreEntity calls.
  • bulkRestoreSubtree is a no-op for null / empty IDs and when no deleted entities are found.
  • Java SDK Tables.find(id).restore().async().execute() and Databases.find(id).restore().async().execute() route to restoreServerAsync; without .async() they go to the sync path.
  • Python SDK Tables.restore_async(id) appends ?async=true, returns an AsyncJobResponse; the fluent restore_request(id).with_async().execute() does the same.

Unit tests

  • I added unit tests for the new/changed logic.
  • Files added/updated:
    • openmetadata-service/src/test/java/.../jdbi3/EntityRepositoryRestoreTest.java
    • openmetadata-sdk/src/test/java/.../fluent/RestoreFluentAPITest.java
    • ingestion/tests/unit/sdk/test_restore_async.py

Backend integration tests

  • I added integration tests in openmetadata-integration-tests/ for new/changed API endpoints.
  • Files added/updated:
    • openmetadata-integration-tests/src/test/java/.../tests/RestoreHierarchyIT.java

Ingestion integration tests

  • Not applicable (no ingestion changes).

Playwright (UI) tests

  • Not applicable (no UI changes).

Manual testing performed

  1. mvn -pl openmetadata-service,openmetadata-sdk -DskipTests compile — clean build.
  2. mvn -pl openmetadata-integration-tests test-compile — clean build.
  3. mvn -pl openmetadata-service test -Dtest=EntityRepositoryRestoreTest — 4/4 pass.
  4. mvn -pl openmetadata-sdk test -Dtest=RestoreFluentAPITest — 4/4 pass.

UI screen recording / screenshots:

Not applicable.

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • My PR is linked to a GitHub issue via Fixes #<issue-number> above.
  • I have added tests (unit / integration as applicable) and listed them above.
  • I have added a test that covers the exact scenario we are fixing (RestoreHierarchyIT builds the multi-level hierarchy and exercises both the bulk-sync and async paths).

🤖 Generated with Claude Code

EntityRepository.restoreEntity walked descendants synchronously, taking
4+ minutes on a 12k-table database and exceeding typical proxy timeouts.
restoreChildren now groups CONTAINS children by type and dispatches one
bulkRestoreSubtree per type, batching DB writes, version history,
change events, and cache invalidation; the existing ES cascade handles
descendant index updates in one update_by_query.

Adds an async option (?async=true) on the deep-hierarchy restore
endpoints that returns 202 Accepted with a job id and runs the restore
on AsyncService, emitting WebSocket notifications on
restoreEntityChannel. Java SDK adds .restore().async().execute() fluent
builders on Tables/Databases plus restoreServerAsync on
EntityServiceBase; Python SDK mirrors this with
restore_request().with_async().execute() and restore_async() helpers
on BaseEntity, exposing a new AsyncJobResponse type.

Tests: EntityRepositoryRestoreTest verifies the per-type grouping and
bulk dispatch path; RestoreFluentAPITest covers the Java SDK fluent
behavior; RestoreHierarchyIT exercises sync and async restore against a
real DB→schemas→tables tree end-to-end; test_restore_async.py covers
the Python SDK paths.

Fixes #4003

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 8, 2026 18:18
@harshach harshach requested a review from a team as a code owner May 8, 2026 18:18
@github-actions github-actions Bot added backend safe to test Add this label to run secure Github workflows on PRs labels May 8, 2026
return;
}
repository.restoreFromSearch(response.getEntity());
addHref(uriInfo, response.getEntity());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Bug: uriInfo used in async lambda after request scope ends

In restoreEntityAsync, the request-scoped uriInfo object is captured by the lambda submitted to the executor (line 822: addHref(uriInfo, response.getEntity())). After the HTTP response is sent (202 Accepted), the JAX-RS container may invalidate or recycle the UriInfo instance, leading to IllegalStateException or incorrect URL generation when addHref calls uriInfo.getBaseUri().

The existing deleteByIdAsync method does NOT use uriInfo inside its async lambda, confirming this is an inconsistency introduced by this PR. The addHref call is non-essential for the async path since the restored entity is only sent via WebSocket notification (not as an HTTP response body), and the WebSocket message uses entity.getName() not HREFs.

Suggested fix:

Remove the `addHref` call from the async lambda. The WebSocket notification only uses `entity.getName()`, so HREFs are unnecessary:

              try {
                PutResponse<T> response = repository.restoreEntity(userName, id);
                if (response == null) {
                  // ... existing error handling
                  return;
                }
                repository.restoreFromSearch(response.getEntity());
-               addHref(uriInfo, response.getEntity());
                LOG.info(...);
                WebsocketNotificationHandler.sendRestoreOperationCompleteNotification(
                    jobId, securityContext, response.getEntity());

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

Comment on lines +389 to +403
public static class TableRestorer {
private final OpenMetadataClient client;
private final String id;

public TableRestorer(OpenMetadataClient client, String id) {
this.client = client;
this.id = id;
}

public AsyncTableRestorer async() {
return new AsyncTableRestorer(client, id);
}

public Table execute() {
try {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Quality: Fluent restorer classes duplicate identical pattern across entities

The TableRestorer/AsyncTableRestorer and DatabaseRestorer/AsyncDatabaseRestorer classes are structurally identical — only the entity type and service accessor differ. This will grow linearly as more entity types gain the async restore feature (the PR already lists 8 resources). Consider extracting a generic EntityRestorer<T> to avoid duplication per the project's 'No Duplication' principle.

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses restore timeouts for large entity hierarchies by replacing per-entity recursive restores with a bulk restore path, and adds an optional server-side async restore mode that returns 202 Accepted and notifies completion via WebSockets. It also updates the Java and Python SDKs to expose the async restore option and adds unit/integration tests covering the new behavior.

Changes:

  • Implement bulk subtree restore (bulkRestoreSubtree) and type-grouped child restore dispatch to reduce DB/ES work per restore.
  • Add ?async=true restore option on deep-hierarchy endpoints, backed by AsyncService and a new restoreEntityChannel WebSocket notification channel.
  • Update Java/Python SDKs with async restore helpers + fluent builders, and add unit/integration tests for sync/async restore flows.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java Adds grouped restoreChildren and new bulkRestoreSubtree bulk restore implementation plus extension hook.
openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java Adds restoreEntity(..., async) overload and server-side async restore execution + 202 response.
openmetadata-service/src/main/java/org/openmetadata/service/util/WebsocketNotificationHandler.java Adds restore completion/failure WebSocket notification helpers.
openmetadata-service/src/main/java/org/openmetadata/service/socket/WebSocketManager.java Introduces RESTORE_ENTITY_CHANNEL constant.
openmetadata-service/src/main/java/org/openmetadata/service/util/RestoreEntityResponse.java New 202 response DTO for async restore requests.
openmetadata-service/src/main/java/org/openmetadata/service/util/RestoreEntityMessage.java New WebSocket payload DTO for restore status notifications.
openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/DatabaseResource.java Wires ?async=true into database restore endpoint + OpenAPI 202 response.
openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/DatabaseSchemaResource.java Wires ?async=true into schema restore endpoint + OpenAPI 202 response.
openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/TableResource.java Wires ?async=true into table restore endpoint + OpenAPI 202 response.
openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/StoredProcedureResource.java Wires ?async=true into stored procedure restore endpoint + OpenAPI 202 response.
openmetadata-service/src/main/java/org/openmetadata/service/resources/services/database/DatabaseServiceResource.java Wires ?async=true into database service restore endpoint + OpenAPI 202 response.
openmetadata-service/src/main/java/org/openmetadata/service/resources/dashboards/DashboardResource.java Wires ?async=true into dashboard restore endpoint + OpenAPI 202 response.
openmetadata-service/src/main/java/org/openmetadata/service/resources/datamodels/DashboardDataModelResource.java Wires ?async=true into data model restore endpoint + OpenAPI 202 response.
openmetadata-service/src/main/java/org/openmetadata/service/resources/storages/ContainerResource.java Wires ?async=true into container restore endpoint + OpenAPI 202 response.
openmetadata-sdk/src/main/java/org/openmetadata/sdk/services/EntityServiceBase.java Adds restoreServerAsync(...) calling PUT /restore?async=true.
openmetadata-sdk/src/main/java/org/openmetadata/sdk/models/AsyncJobResponse.java New Java SDK model for 202 async job response.
openmetadata-sdk/src/main/java/org/openmetadata/sdk/fluent/Tables.java Adds fluent restore builder with .async().execute() for server-side async restore.
openmetadata-sdk/src/main/java/org/openmetadata/sdk/fluent/Databases.java Adds fluent restore builder with .async().execute() for server-side async restore.
ingestion/src/metadata/sdk/entities/base.py Adds Python AsyncJobResponse + restore fluent operation + restore_async.
openmetadata-service/src/test/java/org/openmetadata/service/jdbi3/EntityRepositoryRestoreTest.java Unit tests for grouped restore children + bulk restore no-op behaviors.
openmetadata-sdk/src/test/java/org/openmetadata/sdk/fluent/RestoreFluentAPITest.java Unit tests validating fluent restore routes to sync vs server-async SDK calls.
ingestion/tests/unit/sdk/test_restore_async.py Python unit tests for restore_async and fluent restore request behavior.
openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/RestoreHierarchyIT.java Integration test validating sync restore and async restore (202 + eventual restored state).
Comments suppressed due to low confidence (1)

openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java:790

  • repository.restoreEntity(...) can return null when the entity is not in the deleted state (see EntityRepository.restoreEntity catch). The sync restore path immediately dereferences response.getEntity() which will throw NPE and return 500. Handle the null case explicitly (e.g., return 404/400 with a clear message) before calling restoreFromSearch/addHref.
    OperationContext operationContext =
        new OperationContext(entityType, MetadataOperation.EDIT_ALL);
    authorizer.authorize(securityContext, operationContext, getResourceContextById(id));
    PutResponse<T> response =
        repository.restoreEntity(securityContext.getUserPrincipal().getName(), id);
    repository.restoreFromSearch(response.getEntity());
    addHref(uriInfo, response.getEntity());
    LOG.info(

Comment on lines +804 to +842
public Response restoreEntityAsync(UriInfo uriInfo, SecurityContext securityContext, UUID id) {
OperationContext operationContext =
new OperationContext(entityType, MetadataOperation.EDIT_ALL);
authorizer.authorize(securityContext, operationContext, getResourceContextById(id));
String jobId = UUID.randomUUID().toString();
String userName = securityContext.getUserPrincipal().getName();
ExecutorService executorService = AsyncService.getInstance().getExecutorService();
executorService.submit(
RequestLatencyContext.wrapWithContext(
() -> {
try {
PutResponse<T> response = repository.restoreEntity(userName, id);
if (response == null) {
WebsocketNotificationHandler.sendRestoreOperationFailedNotification(
jobId, securityContext, id.toString(), "Entity is not in deleted state");
return;
}
repository.restoreFromSearch(response.getEntity());
addHref(uriInfo, response.getEntity());
LOG.info(
"[AsyncRestore] Restored {}:{} (jobId={})",
Entity.getEntityTypeFromObject(response.getEntity()),
response.getEntity().getId(),
jobId);
WebsocketNotificationHandler.sendRestoreOperationCompleteNotification(
jobId, securityContext, response.getEntity());
} catch (Exception e) {
LOG.error("[AsyncRestore] Failed to restore {}:{}", entityType, id, e);
WebsocketNotificationHandler.sendRestoreOperationFailedNotification(
jobId,
securityContext,
id.toString(),
e.getMessage() == null ? e.toString() : e.getMessage());
}
}));
RestoreEntityResponse response =
new RestoreEntityResponse(jobId, "Restore initiated successfully.");
return Response.accepted().entity(response).type(MediaType.APPLICATION_JSON).build();
}
Comment on lines +402 to +426
public Table execute() {
try {
return client.tables().restore(id);
} catch (org.openmetadata.sdk.exceptions.OpenMetadataException e) {
throw new RuntimeException(e);
}
}
}

public static class AsyncTableRestorer {
private final OpenMetadataClient client;
private final String id;

public AsyncTableRestorer(OpenMetadataClient client, String id) {
this.client = client;
this.id = id;
}

public org.openmetadata.sdk.models.AsyncJobResponse execute() {
try {
return client.tables().restoreServerAsync(id);
} catch (org.openmetadata.sdk.exceptions.OpenMetadataException e) {
throw new RuntimeException(e);
}
}
Comment on lines +308 to +332
public Database execute() {
try {
return client.databases().restore(id);
} catch (org.openmetadata.sdk.exceptions.OpenMetadataException e) {
throw new RuntimeException(e);
}
}
}

public static class AsyncDatabaseRestorer {
private final OpenMetadataClient client;
private final String id;

public AsyncDatabaseRestorer(OpenMetadataClient client, String id) {
this.client = client;
this.id = id;
}

public org.openmetadata.sdk.models.AsyncJobResponse execute() {
try {
return client.databases().restoreServerAsync(id);
} catch (org.openmetadata.sdk.exceptions.OpenMetadataException e) {
throw new RuntimeException(e);
}
}
Comment on lines +81 to +82
return cls(
job_id=str(payload.get("jobId", "")),
Comment on lines +47 to +48
* empty inputs and invokes the {@code restoreAdditionalChildren} extension hook once per
* restored entity.
…cade

Two follow-up improvements to the bulk restore introduced for #4003.

Batched findTo per tree level: bulkRestoreSubtree previously issued one
findTo per parent during recursion, which at the schemas → tables level
of a 12k-table database meant 12k DB round trips just to enumerate
children. The new bulkRestoreContainedChildren helper does one
findToBatchAllTypes per tree level regardless of fan-out, then groups
the results by child type and dispatches to each repo's
bulkRestoreSubtree. DashboardRepository's chart-restore logic moves
from the now-bypassed restoreChildren override to the existing
restoreAdditionalChildren extension hook so it still runs both for
direct dashboard restores and when dashboards are descendants of a
larger restore.

Symmetric bulk soft-delete cascade: deleteByName/deleteById had the
same per-entity recursion that this PR fixed for restore — soft-
deleting a database with 12k tables ran 12k recursive
Entity.deleteEntity calls, each writing one row + one ES update + one
change event. New bulkSoftDeleteSubtree mirrors bulkRestoreSubtree:
one batched findToBatchAllTypes per level, deferred-store DB writes,
batched version history, batched change events, batched cache
invalidation; per-descendant ES writes are skipped because the existing
deleteFromSearch cascade flips the deleted flag on descendant indexes
in one update_by_query. deleteChildren(List, hardDelete=false, ...)
now dispatches to the bulk path; hard-delete keeps the existing
batchDeleteChildren path. New softDeleteAdditionalChildren extension
hook mirrors restoreAdditionalChildren; DashboardRepository's chart
soft-delete migrates onto it for the same reason.

Tests: extends EntityRepositoryRestoreTest with cases that verify
findToBatchAllTypes is invoked exactly once per level (not once per
parent) for both bulk operations, plus the existing grouping/dispatch
shape for the soft-delete entry point. Extends RestoreHierarchyIT
with a recursive soft-delete cascade assertion.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

The Java checkstyle failed.

Please run mvn spotless:apply in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

Comment on lines +5677 to +5691
private void bulkRestoreContainedChildren(List<T> parents, String updatedBy) {
List<String> parentIds = new ArrayList<>(parents.size());
for (T parent : parents) {
parentIds.add(parent.getId().toString());
}
List<CollectionDAO.EntityRelationshipObject> relationships;
try (var ignored = phase("bulkRestoreFindChildren")) {
relationships =
daoCollection
.relationshipDAO()
.findToBatchAllTypes(parentIds, Relationship.CONTAINS.ordinal(), ALL);
}
if (relationships.isEmpty()) {
return;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Quality: bulkRestore/SoftDeleteContainedChildren are near-identical duplicates

The two private methods bulkRestoreContainedChildren (lines 5677-5705) and bulkSoftDeleteContainedChildren (lines 5825-5853) share identical logic — collect parent IDs, call findToBatchAllTypes, filter by entityType, group by child type, dispatch to the child repo. The only difference is the terminal call (bulkRestoreSubtree vs bulkSoftDeleteSubtree). Per the 'No Duplication' principle, extract a shared helper that accepts the dispatch function.

Suggested fix:

private void dispatchToContainedChildren(
    List<T> parents, String updatedBy, String phaseName,
    BiConsumer<EntityRepository<?>, List<UUID>> dispatcher) {
  List<String> parentIds = parents.stream()
      .map(p -> p.getId().toString()).toList();
  List<CollectionDAO.EntityRelationshipObject> rels;
  try (var ignored = phase(phaseName)) {
    rels = daoCollection.relationshipDAO()
        .findToBatchAllTypes(parentIds, Relationship.CONTAINS.ordinal(), ALL);
  }
  if (rels.isEmpty()) return;
  Map<String, List<UUID>> idsByType = new HashMap<>();
  for (var rel : rels) {
    if (!entityType.equals(rel.getFromEntity())) continue;
    idsByType.computeIfAbsent(rel.getToEntity(), k -> new ArrayList<>())
        .add(UUID.fromString(rel.getToId()));
  }
  idsByType.forEach((type, ids) -> dispatcher.accept(
      Entity.getEntityRepository(type), ids));
}

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

Comment on lines +5735 to +5749
@Transaction
public final void bulkSoftDeleteSubtree(List<UUID> ids, String updatedBy) {
if (ids == null || ids.isEmpty()) {
return;
}
if (!supportsSoftDelete) {
for (UUID id : ids) {
Entity.deleteEntity(updatedBy, entityType, id, true, true);
}
return;
}
List<T> entities;
try (var ignored = phase("bulkSoftDeleteLoad")) {
entities = find(ids, NON_DELETED);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Quality: bulkSoftDeleteSubtree exceeds 15-line method limit (83 lines)

bulkSoftDeleteSubtree spans lines 5736-5818 (83 lines), well above the project's 15-line method guideline. The method has clear phases (guard/load, pre-delete hooks, child cascade, updater creation, version history, persistence, cache invalidation, change events) that can each be extracted into focused private methods, matching the phase-based structure already hinted at by the try (var ignored = phase(...)) calls.

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 8, 2026

Code Review ⚠️ Changes requested 1 resolved / 5 findings

Implements bulk-restore and async-restore paths to optimize large entity hierarchy recovery, but introduces a request-scope violation in the async lambda and several maintenance issues. Addressing the captured uriInfo usage and consolidating duplicated logic in the restorer and bulk-processing methods is required.

⚠️ Bug: uriInfo used in async lambda after request scope ends

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java:822

In restoreEntityAsync, the request-scoped uriInfo object is captured by the lambda submitted to the executor (line 822: addHref(uriInfo, response.getEntity())). After the HTTP response is sent (202 Accepted), the JAX-RS container may invalidate or recycle the UriInfo instance, leading to IllegalStateException or incorrect URL generation when addHref calls uriInfo.getBaseUri().

The existing deleteByIdAsync method does NOT use uriInfo inside its async lambda, confirming this is an inconsistency introduced by this PR. The addHref call is non-essential for the async path since the restored entity is only sent via WebSocket notification (not as an HTTP response body), and the WebSocket message uses entity.getName() not HREFs.

Suggested fix
Remove the `addHref` call from the async lambda. The WebSocket notification only uses `entity.getName()`, so HREFs are unnecessary:

              try {
                PutResponse<T> response = repository.restoreEntity(userName, id);
                if (response == null) {
                  // ... existing error handling
                  return;
                }
                repository.restoreFromSearch(response.getEntity());
-               addHref(uriInfo, response.getEntity());
                LOG.info(...);
                WebsocketNotificationHandler.sendRestoreOperationCompleteNotification(
                    jobId, securityContext, response.getEntity());
💡 Quality: Fluent restorer classes duplicate identical pattern across entities

📄 openmetadata-sdk/src/main/java/org/openmetadata/sdk/fluent/Tables.java:389-403 📄 openmetadata-sdk/src/main/java/org/openmetadata/sdk/fluent/Databases.java:295-309

The TableRestorer/AsyncTableRestorer and DatabaseRestorer/AsyncDatabaseRestorer classes are structurally identical — only the entity type and service accessor differ. This will grow linearly as more entity types gain the async restore feature (the PR already lists 8 resources). Consider extracting a generic EntityRestorer<T> to avoid duplication per the project's 'No Duplication' principle.

💡 Quality: bulkRestore/SoftDeleteContainedChildren are near-identical duplicates

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:5677-5691 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:5825-5839

The two private methods bulkRestoreContainedChildren (lines 5677-5705) and bulkSoftDeleteContainedChildren (lines 5825-5853) share identical logic — collect parent IDs, call findToBatchAllTypes, filter by entityType, group by child type, dispatch to the child repo. The only difference is the terminal call (bulkRestoreSubtree vs bulkSoftDeleteSubtree). Per the 'No Duplication' principle, extract a shared helper that accepts the dispatch function.

Suggested fix
private void dispatchToContainedChildren(
    List<T> parents, String updatedBy, String phaseName,
    BiConsumer<EntityRepository<?>, List<UUID>> dispatcher) {
  List<String> parentIds = parents.stream()
      .map(p -> p.getId().toString()).toList();
  List<CollectionDAO.EntityRelationshipObject> rels;
  try (var ignored = phase(phaseName)) {
    rels = daoCollection.relationshipDAO()
        .findToBatchAllTypes(parentIds, Relationship.CONTAINS.ordinal(), ALL);
  }
  if (rels.isEmpty()) return;
  Map<String, List<UUID>> idsByType = new HashMap<>();
  for (var rel : rels) {
    if (!entityType.equals(rel.getFromEntity())) continue;
    idsByType.computeIfAbsent(rel.getToEntity(), k -> new ArrayList<>())
        .add(UUID.fromString(rel.getToId()));
  }
  idsByType.forEach((type, ids) -> dispatcher.accept(
      Entity.getEntityRepository(type), ids));
}
💡 Quality: bulkSoftDeleteSubtree exceeds 15-line method limit (83 lines)

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:5735-5749

bulkSoftDeleteSubtree spans lines 5736-5818 (83 lines), well above the project's 15-line method guideline. The method has clear phases (guard/load, pre-delete hooks, child cascade, updater creation, version history, persistence, cache invalidation, change events) that can each be extracted into focused private methods, matching the phase-based structure already hinted at by the try (var ignored = phase(...)) calls.

✅ 1 resolved
Bug: bulkRestoreSubtree calls restoreChildren outside @transaction boundary

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:5585-5587
In bulkRestoreSubtree (line 5585-5587), restoreChildren is called for each entity in a loop before the batched DB writes (updateMany, insertMany, etc.) at lines 5629-5641. While JDBI's @Transaction on both methods means they participate in the same transaction when called on the same thread, the recursive descent through restoreChildrenbulkRestoreSubtree means children are restored (DB committed) before parents in a bottom-up fashion. If the process crashes mid-way through a large hierarchy, some children may be restored while their parent remains deleted, leaving the tree in an inconsistent state.

This is acceptable for the use case (restore is idempotent and can be retried), but worth documenting.

🤖 Prompt for agents
Code Review: Implements bulk-restore and async-restore paths to optimize large entity hierarchy recovery, but introduces a request-scope violation in the async lambda and several maintenance issues. Addressing the captured `uriInfo` usage and consolidating duplicated logic in the restorer and bulk-processing methods is required.

1. ⚠️ Bug: uriInfo used in async lambda after request scope ends
   Files: openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java:822

   In `restoreEntityAsync`, the request-scoped `uriInfo` object is captured by the lambda submitted to the executor (line 822: `addHref(uriInfo, response.getEntity())`). After the HTTP response is sent (202 Accepted), the JAX-RS container may invalidate or recycle the `UriInfo` instance, leading to `IllegalStateException` or incorrect URL generation when `addHref` calls `uriInfo.getBaseUri()`.
   
   The existing `deleteByIdAsync` method does NOT use `uriInfo` inside its async lambda, confirming this is an inconsistency introduced by this PR. The `addHref` call is non-essential for the async path since the restored entity is only sent via WebSocket notification (not as an HTTP response body), and the WebSocket message uses `entity.getName()` not HREFs.

   Suggested fix:
   Remove the `addHref` call from the async lambda. The WebSocket notification only uses `entity.getName()`, so HREFs are unnecessary:
   
                 try {
                   PutResponse<T> response = repository.restoreEntity(userName, id);
                   if (response == null) {
                     // ... existing error handling
                     return;
                   }
                   repository.restoreFromSearch(response.getEntity());
   -               addHref(uriInfo, response.getEntity());
                   LOG.info(...);
                   WebsocketNotificationHandler.sendRestoreOperationCompleteNotification(
                       jobId, securityContext, response.getEntity());

2. 💡 Quality: Fluent restorer classes duplicate identical pattern across entities
   Files: openmetadata-sdk/src/main/java/org/openmetadata/sdk/fluent/Tables.java:389-403, openmetadata-sdk/src/main/java/org/openmetadata/sdk/fluent/Databases.java:295-309

   The `TableRestorer`/`AsyncTableRestorer` and `DatabaseRestorer`/`AsyncDatabaseRestorer` classes are structurally identical — only the entity type and service accessor differ. This will grow linearly as more entity types gain the async restore feature (the PR already lists 8 resources). Consider extracting a generic `EntityRestorer<T>` to avoid duplication per the project's 'No Duplication' principle.

3. 💡 Quality: bulkRestore/SoftDeleteContainedChildren are near-identical duplicates
   Files: openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:5677-5691, openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:5825-5839

   The two private methods `bulkRestoreContainedChildren` (lines 5677-5705) and `bulkSoftDeleteContainedChildren` (lines 5825-5853) share identical logic — collect parent IDs, call `findToBatchAllTypes`, filter by `entityType`, group by child type, dispatch to the child repo. The only difference is the terminal call (`bulkRestoreSubtree` vs `bulkSoftDeleteSubtree`). Per the 'No Duplication' principle, extract a shared helper that accepts the dispatch function.

   Suggested fix:
   private void dispatchToContainedChildren(
       List<T> parents, String updatedBy, String phaseName,
       BiConsumer<EntityRepository<?>, List<UUID>> dispatcher) {
     List<String> parentIds = parents.stream()
         .map(p -> p.getId().toString()).toList();
     List<CollectionDAO.EntityRelationshipObject> rels;
     try (var ignored = phase(phaseName)) {
       rels = daoCollection.relationshipDAO()
           .findToBatchAllTypes(parentIds, Relationship.CONTAINS.ordinal(), ALL);
     }
     if (rels.isEmpty()) return;
     Map<String, List<UUID>> idsByType = new HashMap<>();
     for (var rel : rels) {
       if (!entityType.equals(rel.getFromEntity())) continue;
       idsByType.computeIfAbsent(rel.getToEntity(), k -> new ArrayList<>())
           .add(UUID.fromString(rel.getToId()));
     }
     idsByType.forEach((type, ids) -> dispatcher.accept(
         Entity.getEntityRepository(type), ids));
   }

4. 💡 Quality: bulkSoftDeleteSubtree exceeds 15-line method limit (83 lines)
   Files: openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:5735-5749

   `bulkSoftDeleteSubtree` spans lines 5736-5818 (83 lines), well above the project's 15-line method guideline. The method has clear phases (guard/load, pre-delete hooks, child cascade, updater creation, version history, persistence, cache invalidation, change events) that can each be extracted into focused private methods, matching the phase-based structure already hinted at by the `try (var ignored = phase(...))` calls.

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 8, 2026

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

🔴 Playwright Results — 1 failure(s), 12 flaky

✅ 4015 passed · ❌ 1 failed · 🟡 12 flaky · ⏭️ 86 skipped

Shard Passed Failed Flaky Skipped
🔴 Shard 1 298 1 0 4
🟡 Shard 2 751 0 4 8
🟡 Shard 3 757 0 2 7
🟡 Shard 4 789 0 1 18
🟡 Shard 5 686 0 1 41
🟡 Shard 6 734 0 4 8

Genuine Failures (failed on all attempts)

Features/Dashboards.spec.ts › should be able to toggle between deleted and non-deleted charts (shard 1)
Error: �[2mexpect(�[22m�[31mlocator�[39m�[2m).�[22mtoBeVisible�[2m(�[22m�[2m)�[22m failed

Locator: getByTestId('charts-table').getByTestId('no-data-placeholder')
Expected: visible
Timeout: 15000ms
Error: element(s) not found

Call log:
�[2m  - Expect "toBeVisible" with timeout 15000ms�[22m
�[2m  - waiting for getByTestId('charts-table').getByTestId('no-data-placeholder')�[22m

🟡 12 flaky test(s) (passed on retry)
  • Features/ActivityAPI.spec.ts › Activity event shows the actor who made the change (shard 2, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/CuratedAssets.spec.ts › Test Search Indexes with display name filter (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should display correct status badge color and icon (shard 2, 2 retries)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/Table.spec.ts › Tags term should be consistent for search (shard 3, 1 retry)
  • Pages/DataContractsSemanticRules.spec.ts › Validate Description Rule Is_Not_Set (shard 4, 1 retry)
  • Pages/ExplorePageRightPanel.spec.ts › Should verify deleted user not visible in owner selection for container (shard 5, 1 retry)
  • Pages/Glossary.spec.ts › Column dropdown drag-and-drop functionality for Glossary Terms table (shard 6, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)
  • Pages/Tag.spec.ts › Verify Owner Add Delete (shard 6, 1 retry)
  • Pages/UserDetails.spec.ts › Admin user can edit teams from the user profile (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants