Skip to content

Add validations for input/output ports#25601

Open
harshach wants to merge 9 commits intomainfrom
dp_io_1
Open

Add validations for input/output ports#25601
harshach wants to merge 9 commits intomainfrom
dp_io_1

Conversation

@harshach
Copy link
Collaborator

@harshach harshach commented Jan 29, 2026

Add validations for input/output ports

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Port exclusivity validation:
    • Assets cannot be simultaneously designated as both input and output ports on the same data product
  • Output port prerequisite check:
    • Output ports must first belong to the data product assets (HAS relationship), enforced via existsRelationship query in CollectionDAO.java
  • Bulk operation handling:
    • Partial success support with separate tracking of failed operations and descriptive error messages
  • New integration tests:
    • 7 tests in DataProductResourceIT.java covering exclusivity enforcement, prerequisite validation, port movement, asymmetric input/output behavior, and asset deletion cleanup

This will update automatically on new commits.


@sonarqubecloud
Copy link

sonarqubecloud bot commented Feb 4, 2026

@gitar-bot
Copy link

gitar-bot bot commented Feb 7, 2026

🔍 CI failure analysis for 2b06abf: maven-sonarcloud-ci shows same 6 DataProductResourceTest failures as maven-postgresql-ci. All 7 Data Product tests (6 unit + 1 integration) fail consistently across all CI configurations, all need same fix: add assets before output ports.

CI Failure Analysis - Complete Picture

New Failure: maven-sonarcloud-ci (Job 62857565008)

Result: Build completed with failures

6 DataProductResourceTest Unit Test Failures - RELATED TO PR:

  1. testDataProductBulkOutputPorts:850 - FAILURE
  2. testDataProductBulkPortsViaApi:911 - ERROR
  3. testDataProductDomainMigrationWithInputOutputPorts:1732 - FAILURE
  4. testGetOutputPortsReturnsFullEntities:1012 - FAILURE
  5. testGetPortsByNameEndpoints:1108 - FAILURE
  6. testGetPortsViewEndpoint:1066 - FAILURE

These are the SAME 6 unit test failures seen in maven-postgresql-ci (Job 62857565578), confirming 100% consistency across both PostgreSQL and SonarCloud builds.


Previous Failure: maven-postgresql-ci (Job 62857565578)

Same 6 DataProductResourceTest failures with identical errors


Root Cause Analysis

These unit tests call bulkAddOutputPorts() WITHOUT first calling bulkAddAssets(). The PR's new validation requires entities to be added as data product assets (HAS relationship) BEFORE they can be added as output ports.

The Solution:

Update DataProductResourceTest.java tests to add assets before output ports:

// Add entity as data product asset first
BulkAssets assetRequest = new BulkAssets().withAssets(outputPorts);
repository.bulkAddAssets(product.getFullyQualifiedName(), assetRequest);

// Then add as output port
repository.bulkAddOutputPorts(product.getFullyQualifiedName(), bulkAssets);

Test Report (Job 62863799777)

Status: Consequence of test failures above


Integration Test Failure (Still Applicable)

Test: DataProductDomainMigrationIT.testDataProductDomainMigrationWithInputOutputPorts

Status: Failing consistently in BOTH MySQL and PostgreSQL integration test builds

Same root cause and solution - needs to add assets before output ports at line 189-190


Complete Picture: 7 Total Test Failures

Unit Tests (6): DataProductResourceTest.java

  • Consistently failing in BOTH maven-postgresql-ci AND maven-sonarcloud-ci
  • 100% reproducible

Integration Test (1): DataProductDomainMigrationIT.java

  • Consistently failing in BOTH MySQL-Elasticsearch AND PostgreSQL-OpenSearch builds
  • 100% reproducible

Root Cause (All 7 Tests): Call bulkAddOutputPorts() without first calling bulkAddAssets(), violating the PR's new validation requirement.


Summary

PR Implementation: Excellent validation logic with code review optimizations

E2E Tests (InputOutputPorts.spec.ts): Already fixed

⚠️ Actions Required - All 7 test failures have the SAME root cause:

  1. Update DataProductResourceTest.java - 6 unit tests (failing in BOTH maven-postgresql-ci and maven-sonarcloud-ci)
  2. Update DataProductDomainMigrationIT.java - 1 integration test (failing in BOTH MySQL and PostgreSQL)

📊 Consistency: Identical failures across multiple CI configurations confirm legitimate test issues

🔴 Unrelated: Python ingestion tests, Playwright failures

Code Review 👍 Approved with suggestions 4 resolved / 5 findings

Well-structured validation for input/output port exclusivity and output port prerequisites, with proper partial success handling and good test coverage. Both previous performance and edge case findings are resolved. One minor dead code issue with the unused existsRelationship DAO method.

💡 Quality: Unused existsRelationship method added to CollectionDAO

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java:1805-1813

The existsRelationship method was added to CollectionDAO.java (lines 1805-1813) but is never called anywhere in the codebase. The validation logic in DataProductRepository.executeBulkPortsOperation uses a pre-loaded Set<UUID> approach instead (lines 307-323), which is actually the better pattern for batch operations.

Consider removing this dead code to keep the DAO interface clean, or add a comment explaining if it's intended for future use.

✅ 4 resolved
Edge Case: existsRelationship query doesn't verify toEntity type

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/CollectionDAO.java:1796-1807
The existsRelationship() SQL query in CollectionDAO.java doesn't include toEntity in its WHERE clause, unlike similar methods in the DAO (e.g., deleteIfExistsByParams which includes toEntity).

While this may work correctly for the current use case where the entity IDs are unique, it could theoretically match relationships with different toEntity types if the same UUID is used across different entity types. This is unlikely to cause issues in practice since UUIDs are globally unique, but for consistency with other relationship queries, consider adding toEntity to the query:

SELECT COUNT(*) FROM entity_relationship 
WHERE fromId = :fromId AND toId = :toId 
AND fromEntity = :fromEntity AND toEntity = :toEntity 
AND relation = :relation

This would require adding a toEntity parameter to the method signature.

Performance: Two database queries per asset in validation loop

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DataProductRepository.java:295
For each asset being added as an output port, the code makes two separate database queries:

  1. existsRelationship() to check if asset exists in opposite port type
  2. existsRelationship() to check if asset is a data product asset

For bulk operations with many assets, this results in 2N database round trips. Consider batching these checks using bulk queries that accept multiple IDs to improve performance for large bulk operations.

Bug: Status set to PARTIAL_SUCCESS prematurely on first failure

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DataProductRepository.java:358
In executeBulkPortsOperation(), the status is set to PARTIAL_SUCCESS immediately when the first asset fails validation (lines 358 and 379), even though subsequent assets haven't been processed yet. If all assets end up failing, the status is correctly updated to FAILURE at the end (lines 395-397). However, this approach is correct because it's later refined.

Actually, upon closer inspection, this is handled correctly - the status is refined at lines 395-397 where it's set to FAILURE if success.isEmpty() && !failed.isEmpty(). The initial PARTIAL_SUCCESS assignment is just a preliminary value.

Let me reconsider - the logic is actually correct. Ignore this finding.

Bug: Change description generated even when all operations fail

📄 openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/DataProductRepository.java:395-407
In DataProductRepository.java, when all bulk operations fail, the change description is still potentially generated if !success.isEmpty() (line 399), but the issue is with the logic flow.

Looking at lines 399-407, the code generates a change description when !success.isEmpty(). However, the code block continues to set field names on the change description even when there are no successful operations (since it's inside the same block).

Actually, re-reviewing: the condition if (!success.isEmpty()) at line 399 correctly guards the change description generation block. This is correct.

However, there's a subtle issue: after line 397 where status is set to FAILURE, the code at line 399 checks if (!success.isEmpty()) which would be false, so the change description block is skipped correctly.

Let me reconsider the actual issue here - the logic appears sound. Withdrawing this finding.

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants