
Conversation

@harshach (Collaborator) commented Feb 5, 2026

This pull request reworks how bulk operations are managed, reducing database queries and connection usage through batch processing. The implementation combines batch database operations with up-front authorization checks and graceful fallback mechanisms to handle bulk imports/exports efficiently.

Bulk Operation Improvements:

  • Implemented batch database operations for bulk create and update, reducing N individual database queries to a small, fixed number of batch statements. This significantly cuts database connection usage and improves performance.
  • Added pre-fetching of existing entities in a single query, enabling efficient conflict detection and update preparation before database writes.
  • Implemented batch authorization checks that validate permissions for all entities before processing, preventing wasted database work on unauthorized operations.
  • Added deferred store pattern for updates (updateWithDeferredStore) that enables batch database writes while maintaining proper lifecycle event handling.
  • Graceful fallback to per-entity processing when batch operations fail, ensuring reliability (a minimal sketch of this flow follows the list).
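
A minimal sketch of the create path with graceful fallback, assuming the createManyEntities and createOrUpdate helpers referenced in the review comments below; the method signature and failure bookkeeping are assumptions, not the PR's actual code:

private void bulkCreateWithFallback(UriInfo uriInfo, List<T> newEntities, String userName) {
  try {
    // Happy path: one batch insert instead of N individual writes
    createManyEntities(newEntities);
  } catch (Exception batchError) {
    LOG.warn("Batch create failed, falling back to per-entity processing", batchError);
    for (T entity : newEntities) {
      try {
        createOrUpdate(uriInfo, entity, userName); // full single-entity lifecycle
      } catch (Exception e) {
        // record this entity in failedRequests with e.getMessage()
      }
    }
  }
}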

Bulk Write/Update Methods:

  • Added bulkWrite() and bulkUpdate() methods to EntityDAO for efficient batch database operations (a possible shape is sketched after this list).
  • Refactored EntityRepository bulk operations to use new batch methods, with proper error handling and authorization.
  • Updated EntityResource to handle bulk operations with async 202 responses, returning both immediate auth failures and processing results.
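
One plausible shape for the new DAO methods, sketched with JDBI3 SqlObject annotations; the interface name, table, columns, and SQL below are assumptions for illustration only, not the PR's actual code:

import java.util.List;
import org.jdbi.v3.sqlobject.customizer.Bind;
import org.jdbi.v3.sqlobject.statement.SqlBatch;

// Hypothetical interface: the real EntityDAO parameterizes table and column
// names per entity type; JDBI zips the parallel lists and runs the statement
// once per element pair in a single batch.
public interface EntityBatchDAO {
  @SqlBatch("INSERT INTO entity_table (id, json) VALUES (:id, :json)")
  void bulkWrite(@Bind("id") List<String> ids, @Bind("json") List<String> jsons);

  @SqlBatch("UPDATE entity_table SET json = :json WHERE id = :id")
  void bulkUpdate(@Bind("id") List<String> ids, @Bind("json") List<String> jsons);
}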

Configuration and Setup:

  • Simplified BulkOperationConfiguration by removing complex semaphore-based throttling in favor of simpler batch processing.
  • Updated configuration for bulk operations with queue size and thread pool settings.
  • Tuned HikariCP connection pool settings for faster failure detection and more aggressive resource reclamation, as illustrated in the sketch below.
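
The tuning described above maps roughly to the following HikariCP knobs; the values are illustrative assumptions, not the PR's actual settings:

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

// Illustrative values only: shorter connection/validation timeouts give faster
// failure detection; shorter idle/max lifetimes reclaim connections sooner.
public final class PoolTuningSketch {
  public static HikariDataSource build(String jdbcUrl, String user, String password) {
    HikariConfig config = new HikariConfig();
    config.setJdbcUrl(jdbcUrl);
    config.setUsername(user);
    config.setPassword(password);
    config.setConnectionTimeout(5_000);   // fail fast when the pool is exhausted (ms)
    config.setValidationTimeout(3_000);   // detect dead connections quickly (ms)
    config.setIdleTimeout(60_000);        // reclaim idle connections sooner (ms)
    config.setMaxLifetime(300_000);       // recycle connections aggressively (ms)
    config.setMinimumIdle(5);
    config.setMaximumPoolSize(50);
    return new HikariDataSource(config);
  }
}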

Testing and Code Quality:

  • Added comprehensive integration tests in BulkOperationPermissionsIT covering create, update, permission checks, and concurrency scenarios.
  • Updated all entity resource integration tests with bulk operation test coverage.
  • Tests validate batch operations work correctly across many entity types (API endpoints, collections, databases, schemas, tables, teams, test cases, etc.).

These changes together provide more robust, predictable, and scalable handling of bulk operations by reducing database query overhead and connection usage through efficient batch processing.

.withRequest(updater.getUpdated().getFullyQualifiedName())
.withStatus(Status.OK.getStatusCode()));
}
} catch (Exception batchError) {

⚠️ Bug: Fallback storeUpdate() may duplicate version history entries

When the batch DB write in bulkUpdateEntities partially succeeds — for example, the batch history inserts via insertMany succeed but the subsequent updateMany for entity rows fails — the catch block falls back to per-entity storeUpdate() for all updaters.

storeUpdate() calls storeEntityHistory() which inserts the same version extension record that was already inserted by the batch insertMany. This can produce duplicate version history entries for entities whose history was already recorded in the successful batch insert.

Additionally, storeUpdate() calls updateVersion() which re-computes the version from original.getVersion(). While the version computation is idempotent (same input → same output), the combination of already-written DB state + full storeUpdate() is fragile.

Suggested fix: Track which batch operations succeeded to avoid re-processing. For example:

  1. Wrap the batch history insert and entity update in a single DB transaction so they either both succeed or both fail.
  2. Or, split the fallback to skip history insertion if it was already done, e.g. by catching exceptions separately for each batch step and only falling back on the step that failed.


.withRequest(entity.getFullyQualifiedName())
.withStatus(Status.BAD_REQUEST.getStatusCode())
.withMessage(e.getMessage()));
createManyEntities(newEntities);

⚠️ Bug: Batch create path skips writeThroughCache for new entities

The batch create path calls createManyEntities(newEntities) at line 7877 which does NOT call writeThroughCache or writeThroughCacheMany. In contrast, the single-entity createNewEntity() (used in the fallback path at line 7894 via createOrUpdate) does call writeThroughCache(entity, false).

This means when batch creation succeeds (the happy path), newly created entities are never written to the Redis distributed cache. Subsequent reads may experience cache misses and hit the database, partially negating the performance benefit of batch operations. In a multi-node deployment, other nodes won't see the new entities in their cache until they're naturally loaded.

Suggested fix: Add a writeThroughCacheMany(newEntities, false) call after the successful createManyEntities call, analogous to how the import path handles it:

createManyEntities(newEntities);
writeThroughCacheMany(newEntities, false);  // Add this



gitar-bot bot commented Feb 6, 2026

🔍 CI failure analysis for 206fbed: 1 test failure in TableResourceIT.test_tableLineage - Elasticsearch version conflict during lineage update. Likely unrelated to bulk operations changes.

Test Failure (Likely Unrelated to PR)

Issue

Single test failure in table lineage functionality

Failed Test

TableResourceIT.test_tableLineage

Root Cause

Elasticsearch version conflict during search index update. The test attempts to update a database service in the search index but encounters a version conflict: required seqNo [385] vs current document has seqNo [549]

Error Details

  • Error: ApiException (500) with Elasticsearch 409 Conflict
  • Operation: POST /openmetadata_database_service_search_index/_update_by_query
  • Conflict: Version conflict - document was modified between read and write operations
  • Index: openmetadata_database_service_search_index_rebuild_1770370841629
  • Document ID: 5b50a6e5-f226-4133-8883-c62ec9044926
  • Test Location: TableResourceIT.java:2534
  • Jobs: 62730929664 (MySQL+ES), 62734630837 (Test Report)
  • Commit: 206fbed (10 commits total)

Relation to PR

LIKELY UNRELATED - This PR modifies bulk operations code (EntityRepository, EntityResource, BulkExecutor). The failure is in lineage functionality with Elasticsearch search index versioning. No lineage-related code was modified.

Analysis

This appears to be a race condition or timing issue in the search index update mechanism during lineage operations, not related to the bulk operations implementation. The PR does modify SearchIndexResourceIT.java but only to add bulk operation tests, not to change lineage or search index update logic.

Context

Major Progress: The 17+ compilation errors that blocked all CI jobs have been resolved. Tests are now actually running and only 1 test is failing (out of 162 tests in this suite).

Code Review ⚠️ Changes requested 13 resolved / 17 findings

Well-structured batch operations refactor that significantly reduces DB queries. Three prior findings remain unresolved: fallback path risks duplicate history entries, batch create skips Redis cache, and auth uses EDIT_ALL for all update fields.

⚠️ Bug: Fallback storeUpdate() may duplicate version history entries

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:7815 📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:7819

When the batch DB write in bulkUpdateEntities partially succeeds — for example, the batch history inserts via insertMany succeed but the subsequent updateMany for entity rows fails — the catch block falls back to per-entity storeUpdate() for all updaters.

storeUpdate() calls storeEntityHistory() which inserts the same version extension record that was already inserted by the batch insertMany. This can produce duplicate version history entries for entities whose history was already recorded in the successful batch insert.

Additionally, storeUpdate() calls updateVersion() which re-computes the version from original.getVersion(). While the version computation is idempotent (same input → same output), the combination of already-written DB state + full storeUpdate() is fragile.

Suggested fix: Track which batch operations succeeded to avoid re-processing (a sketch of the first option follows this list). For example:

  1. Wrap the batch history insert and entity update in a single DB transaction so they either both succeed or both fail.
  2. Or, split the fallback to skip history insertion if it was already done, e.g. by catching exceptions separately for each batch step and only falling back on the step that failed.
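
A rough sketch of option 1 using JDBI's transaction support; the attached DAO and its parameters are assumptions, while insertMany/updateMany are the batch steps named in this finding:

// Hypothetical: run both batch writes in one transaction so a failure in the
// second statement rolls back the history insert as well, letting the per-entity
// fallback start from a clean state.
jdbi.useTransaction(handle -> {
  BatchDAO dao = handle.attach(BatchDAO.class);
  dao.insertMany(historyRows); // batch version-history insert
  dao.updateMany(entityRows);  // batch entity-row update
});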
⚠️ Bug: Batch create path skips writeThroughCache for new entities

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:7877

The batch create path calls createManyEntities(newEntities) at line 7877 which does NOT call writeThroughCache or writeThroughCacheMany. In contrast, the single-entity createNewEntity() (used in the fallback path at line 7894 via createOrUpdate) does call writeThroughCache(entity, false).

This means when batch creation succeeds (the happy path), newly created entities are never written to the Redis distributed cache. Subsequent reads may experience cache misses and hit the database, partially negating the performance benefit of batch operations. In a multi-node deployment, other nodes won't see the new entities in their cache until they're naturally loaded.

Suggested fix: Add a writeThroughCacheMany(newEntities, false) call after the successful createManyEntities call, analogous to how the import path handles it:

createManyEntities(newEntities);
writeThroughCacheMany(newEntities, false);  // Add this
⚠️ Bug: Fallback path in bulkUpdateEntities calls storeUpdate() after deferred update

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:7817

In bulkUpdateEntities(), each updater calls updateWithDeferredStore() which sets up change descriptions, relationships, and computes version changes. If the batch DB write fails, the fallback calls updater.storeUpdate(). However, storeUpdate() is designed for the normal update path - it calls updateInternal() again and then stores the entity. Since updateWithDeferredStore() already called updateInternal(), calling storeUpdate() will re-execute the update logic (relationship changes, etc.) a second time, potentially causing duplicate relationship insertions or other side effects.

The fallback should instead perform only the store operations (entity row write + version history insert) for each updater, rather than re-running the full update lifecycle via storeUpdate().

Impact: If batch writes fail and fallback engages, entities may get duplicate relationships or incorrect change descriptions due to double execution of updateInternal().

Suggested fix: Create a dedicated storeOnly() method on EntityUpdater that only writes the entity row and version history without re-running update logic, or restructure the fallback to only perform the individual DB writes that the batch write was supposed to do.
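
A sketch of what the suggested storeOnly() method might look like; the storeEntityHistory/storeEntity calls mirror methods named in this review, but their exact signatures here are assumptions:

// Hypothetical EntityUpdater method: persist the state already computed by
// updateWithDeferredStore() without re-running updateInternal(), so relationship
// changes and change descriptions are not applied a second time.
public void storeOnly() {
  storeEntityHistory(original); // version-history record for the prior version
  storeEntity(updated, true);   // entity row carrying the new version
}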

💡 Edge Case: Auth check uses EDIT_ALL for all updates regardless of fields changed

📄 openmetadata-service/src/main/java/org/openmetadata/service/resources/EntityResource.java:1156

In EntityResource.bulkCreateOrUpdate() Phase 3 (line 1156), for existing entities, the authorization check always uses MetadataOperation.EDIT_ALL. In the non-bulk PUT path, the authorization may be more granular (e.g., EDIT_DESCRIPTION for description-only changes). Using EDIT_ALL is more restrictive than necessary, which could cause bulk updates from users with limited permissions (e.g., users who can edit descriptions but not all fields) to be rejected when the equivalent individual PUT would succeed.

This is noted as a "hardcoded" permission that could be surprising to users.

Impact: Users with granular permissions (e.g., EDIT_DESCRIPTION only) may fail bulk operations even though individual operations would succeed.

Suggested fix: Consider computing the appropriate MetadataOperation based on which fields actually changed between the existing and new entity, similar to how the normal PUT path works. Alternatively, document this as a known limitation of the bulk API.
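
One possible shape for the first suggestion, narrowing the operation when only the description changed; the field-comparison helper is hypothetical:

// Hypothetical helper: require only EDIT_DESCRIPTION when the description is the
// sole changed field, otherwise fall back to EDIT_ALL.
private MetadataOperation operationFor(T existing, T updated) {
  boolean descriptionChanged =
      !java.util.Objects.equals(existing.getDescription(), updated.getDescription());
  boolean otherFieldsChanged = hasNonDescriptionChanges(existing, updated); // hypothetical
  return (descriptionChanged && !otherFieldsChanged)
      ? MetadataOperation.EDIT_DESCRIPTION
      : MetadataOperation.EDIT_ALL;
}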

✅ 13 resolved
Bug: Batch create success path skips lifecycle hooks (prepare, store, postCreate)

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:7877
In bulkCreateOrUpdateEntitiesSequential(), the happy path for new entity creation calls createManyEntities(newEntities) (line 7877). If this batch create succeeds, individual entities skip the full lifecycle: storeEntity(), storeRelationships(), postCreate(), and other hooks that createOrUpdate() normally invokes. The createManyEntities() method likely only does the DB insert, but relationship creation, search indexing setup, and other post-creation hooks may be missed.

If createManyEntities() does handle all lifecycle aspects, this concern is mitigated. But given the fallback at line 7894 calls the full createOrUpdate() which includes all lifecycle hooks, there's an asymmetry suggesting the batch path may be incomplete.

Impact: Batch-created entities may be missing relationships, search index entries, or other post-creation artifacts compared to individually created entities.

Suggested fix: Verify that createManyEntities() handles the full entity lifecycle, or add explicit calls to storeRelationships() and postCreate() for each entity after batch insert.
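
If createManyEntities() turns out to cover only the row insert, a minimal version of this fix would run the remaining hooks per entity after the batch succeeds; method names follow those cited above and their signatures are assumptions:

createManyEntities(newEntities);  // batch row insert
for (T entity : newEntities) {
  storeRelationships(entity);     // ownership, domain, and other relationship edges
  postCreate(entity);             // search indexing and other post-create hooks
}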

Edge Case: Duplicate FQN entities share same object reference in batch

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:7919
In bulkCreateOrUpdateEntitiesSequential(), when duplicate FQNs are detected within a batch (lines 7865-7866), the second occurrence is placed in updateEntities. However, the entity object for the duplicate is the same object reference that was placed in newEntities (or a different object with the same FQN from the input list).

When the update path runs and calls existingByFqn.get(entity.getFullyQualifiedName()) to get the "original", it retrieves the freshly created entity. This relies on findEntityByNames (line 7927) correctly fetching the just-created entity. If createManyEntities hasn't flushed/committed yet when findEntityByNames runs, the lookup could fail.

The code does handle this case with the filter at lines 7935-7946, marking unfound entities as failed. However, this means duplicate FQNs in a batch where the first create fails will silently mark both as failed rather than retrying the second as a fresh create.

This is a minor edge case since duplicate FQNs in a single batch is unusual.

Edge Case: Unhandled setFieldsInBulk exception loses all bulk results

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:7722
In bulkUpdateEntities, setFieldsInBulk(putFields, originals) at line 7722 is called before the per-entity loop but has no try-catch around it. If this call throws (e.g., a database error loading relationship fields in batch), the exception propagates through bulkUpdateEntities → bulkCreateOrUpdateEntitiesSequential → caller without being caught.

Impact:

  1. Any entities already batch-created in the create phase have been committed to the DB, but their success responses are lost (the result object is never returned).
  2. The caller receives an exception instead of a partial result, making it impossible to know which entities succeeded.
  3. In the async path, the catch at line 7660 sets numberOfRowsFailed = entities.size(), which is incorrect since some entities may have been successfully created.

Suggested fix: Wrap the setFieldsInBulk call in a try-catch within bulkUpdateEntities. On failure, fall back to per-entity processing (similar to how batch create falls back):

try {
    setFieldsInBulk(putFields, originals);
} catch (Exception e) {
    LOG.warn("Batch field loading failed, falling back to per-entity updates", e);
    for (T entity : updateEntities) {
        try {
            PutResponse<T> putResponse = createOrUpdate(uriInfo, entity, userName);
            // record success...
        } catch (Exception ex) {
            // record failure...
        }
    }
    return;
}
Edge Case: Batch update setFieldsInBulk may fail leaving updater list empty

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:7696
In bulkUpdateEntities (line 7696), setFieldsInBulk(putFields, originals) is called before the per-entity loop to batch-load fields. If this method throws an exception (e.g., due to a corrupted entity in the batch), it will propagate uncaught and none of the entities in updateEntities will be processed — they won't appear in either successRequests or failedRequests.

This would cause the final result to have numberOfRowsProcessed that doesn't match the sum of passed + failed, making the result inconsistent.

Suggestion: Wrap setFieldsInBulk in a try-catch. On failure, either fall back to per-entity setFieldsInternal or add all update entities to failedRequests with a descriptive error message.

Edge Case: setFieldsInBulk exception loses all update entities

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/EntityRepository.java:7696
In bulkUpdateEntities (line 7696), setFieldsInBulk(putFields, originals) is called outside of any try-catch. If this throws an exception (e.g., database connectivity issue during field loading), the entire update batch is lost — none of the entities are reported as success or failure in the bulk result.

List<T> originals =
    updateEntities.stream()
        .map(e -> existingByFqn.get(e.getFullyQualifiedName()))
        .collect(Collectors.toList());
setFieldsInBulk(putFields, originals);  // unguarded — exception propagates up

The caller bulkCreateOrUpdateEntitiesSequential calls bulkUpdateEntities(...) but doesn't wrap it in try-catch either, so the exception would abort the entire operation mid-way.

Fix: Wrap setFieldsInBulk in a try-catch and fall back to per-entity processing if it fails:

try {
    setFieldsInBulk(putFields, originals);
} catch (Exception e) {
    LOG.warn("Batch field loading failed, falling back to per-entity updates", e);
    for (T entity : updateEntities) {
        // fall back to createOrUpdate per entity
    }
    return;
}

...and 8 more resolved from earlier reviews



sonarqubecloud bot commented Feb 6, 2026


Labels

backend · safe to test (add this label to run secure GitHub workflows on PRs)
