
Add missing SDK capabilities for python #25687

Open
harshach wants to merge 1 commit into main from fix_python_sdk

Conversation

@harshach
Collaborator

@harshach harshach commented Feb 3, 2026

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • Extended Python SDK base entity class with delete_by_name(), add_vote(), and remove_vote() methods for enhanced entity management and community engagement features
  • Added profiler configuration methods (get_profiler_config, set_profiler_config, delete_profiler_config) to Databases and DatabaseSchemas classes for programmatic profiler management
  • Implemented 13 new table-specific methods including sample data operations, pipeline observability tracking, join information, data modeling, and custom metrics management
  • Refactored Tables.add_tag() and Tables.update_column_description() to use simpler copy.deepcopy pattern, replacing complex patch logic for improved code maintainability
  • Added add_test_connection_result() method to DatabaseServices for storing connection test results

@gitar-bot

gitar-bot bot commented Feb 3, 2026

🔍 CI failure analysis for d7b2025: Three types of failures: (1) PyLint import violations in tables.py, (2) Type checking errors for TagLabel constructor and Column.description assignment, (3) Playwright E2E test failures that are unrelated to this PR's backend Python SDK changes and appear to be flaky/infrastructure issues.

Issue 1: PyLint Checkstyle Failure (py-checkstyle)

PyLint checkstyle failed with error code 16, achieving a score of 9.99/10 (below the required 10.0 threshold).

Root Cause

Three import statements in ingestion/src/metadata/sdk/entities/tables.py violate PyLint rule C0415 (import-outside-toplevel) by being placed inside method bodies instead of at module level:

  1. Line 22: import copy inside add_tag() method
  2. Line 27: from metadata.generated.schema.type.tagLabel import TagLabel inside add_tag() method
  3. Line 38: import copy inside update_column_description() method

Solution

The fix is to move these imports to the module-level import section at the top of the file (after line 2). Since copy is used in both methods, a single module-level import copy replaces both in-method occurrences, so only two import lines (copy and TagLabel) need to be added.
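The pattern can be sketched in isolation as follows. This is a minimal illustration, not the SDK code: plain dicts stand in for the Table entity, and both function bodies are hypothetical simplifications. The point is only that one module-level import copy serves every function that needs it.

```python
import copy  # module level: satisfies C0415 and serves both functions below


def add_tag(table: dict, tag_fqn: str) -> dict:
    """Return a deep copy of `table` with one more tag label attached."""
    patched = copy.deepcopy(table)
    patched.setdefault("tags", []).append({"tagFQN": tag_fqn})
    return patched


def update_column_description(table: dict, column_name: str, description: str) -> dict:
    """Return a deep copy of `table` with the named column's description set."""
    patched = copy.deepcopy(table)
    for col in patched.get("columns", []):
        if col["name"] == column_name:
            col["description"] = description
    return patched
```

Because each function works on a deep copy, the caller's original object is left untouched, which is what makes the copy-then-update approach easy to reason about.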


Issue 2: Type Checking Failures (py-run-tests)

All three py-run-tests jobs (Python 3.10 and 3.11) failed during static type checking with basedpyright, reporting 2 errors in ingestion/src/metadata/sdk/entities/tables.py.

Root Cause 1: TagLabel Missing Required Arguments

Line 29: TagLabel(tagFQN=tag_fqn, labelType="Manual", state="Confirmed")

Error: Arguments missing for parameters "name", "displayName", "description", "source", "href", "reason", "appliedAt", "appliedBy", "metadata"

The TagLabel constructor is being called with only 3 arguments (tagFQN, labelType, state), but the type checker reports that 9 additional required parameters are missing.

Root Cause 2: Type Mismatch for Column Description

Line 44: col.description = description

Error: Cannot assign to attribute "description" for class "Column". Type "str" is not assignable to type "Markdown | None"

The update_column_description method receives a description parameter as a plain str, but the Column.description attribute expects a Markdown type (or None).

Solutions for Type Checking

For the TagLabel issue, the options are:

  1. Provide all required arguments when creating the TagLabel instance, or
  2. Use optional parameters if the schema allows defaults, or
  3. Use a different constructor pattern that the codebase supports for creating tags

For the Column description issue: The solution is to convert the string to a Markdown type before assignment:

from metadata.generated.schema.type.basic import Markdown
col.description = Markdown(description)  # or Markdown(root=description)

Issue 3: Playwright E2E Test Failures (playwright-ci-postgresql) - UNRELATED TO PR CHANGES

Assessment

9 Playwright E2E tests failed in the frontend test suite, testing UI functionality such as:

  • Incident Manager lifecycle
  • Table Search functionality (Services, API Collections, Data Models, Drives)
  • API Collection entity owner propagation
  • Schema table operations

Additionally, 8 tests were flaky (passed on retry).

Root Cause Analysis

This PR only modifies Python SDK backend files:

  • ingestion/src/metadata/sdk/entities/base.py
  • ingestion/src/metadata/sdk/entities/database_services.py
  • ingestion/src/metadata/sdk/entities/databases.py
  • ingestion/src/metadata/sdk/entities/databaseschemas.py
  • ingestion/src/metadata/sdk/entities/tables.py

The failing tests are frontend E2E tests that interact with the UI and test user-facing functionality. Changes confined to these Python SDK files are unlikely to cause frontend E2E test failures.

Error Patterns

The failures show typical E2E test instability issues:

  • Target page, context or browser has been closed (browser/context lifecycle issues)
  • Test timeout exceeded (timing/performance issues)
  • element(s) not found (element visibility/loading issues)
  • locator.fill: value: expected string, got undefined (data setup issues)

Conclusion

These Playwright failures appear to be flaky test failures or pre-existing issues unrelated to this PR's backend SDK changes. The PR should not be blocked on these failures. If these tests were passing on the base branch before this PR, it may indicate infrastructure or timing issues in the CI environment, not issues caused by the PR changes.

Code Review ⚠️ Changes requested 0 resolved / 2 findings

No new issues found. Two previous findings remain unresolved: DELETE request using data= for query parameters, and update_column_description silently succeeding when column is not found.

⚠️ Bug: DELETE request uses `data=` for query parameters

📄 ingestion/src/metadata/sdk/entities/base.py:414

In the delete_by_name method, the params dictionary containing recursive and hardDelete is passed as data=params to the REST client's delete method. However, these parameters should be URL query parameters, not request body data.

Most REST clients expect query parameters via a params= argument, not data=. This could cause the API call to fail or not include the expected parameters, resulting in incorrect behavior (e.g., default values for recursive/hardDelete instead of user-provided values).

Suggested fix:

rest_client.delete(f"{endpoint}/name/{fqn}", params=params)

Note: Verify the REST client's API to confirm the correct parameter name for query strings. If the REST client specifically uses data for query parameters on DELETE requests, then this is correct but uncommon.
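If the client turns out not to accept a params= argument, one fallback is to encode the flags into the request path itself. The following is a minimal sketch using only the standard library. build_delete_by_name_path is a hypothetical helper, not part of the SDK, and percent-encoding the FQN is an assumption about what the server expects.

```python
from urllib.parse import quote, urlencode


def build_delete_by_name_path(
    endpoint: str, fqn: str, recursive: bool = False, hard_delete: bool = False
) -> str:
    """Return the DELETE path with both flags encoded as a URL query string."""
    params = {
        "recursive": str(recursive).lower(),
        "hardDelete": str(hard_delete).lower(),
    }
    # Assumption: the FQN must stay a single path segment, so "/" is escaped too.
    return f"{endpoint}/name/{quote(fqn, safe='')}?{urlencode(params)}"


path = build_delete_by_name_path("/tables", "svc.db.schema.orders", recursive=True)
print(path)  # /tables/name/svc.db.schema.orders?recursive=true&hardDelete=false
```

The resulting string could then be passed as the only argument to the client's delete call, so the flags reach the server regardless of how the client treats data=.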

💡 Edge Case: Column not found silently succeeds in `update_column_description`

📄 ingestion/src/metadata/sdk/entities/tables.py:42-46

The update_column_description method iterates through columns looking for a match by name. If no column matches, the method proceeds to call cls.update(patched) without any modification, silently succeeding without updating anything.

This could be confusing for SDK users who expect an error when trying to update a non-existent column.

Impact: Users may think they successfully updated a column description when nothing was actually changed, leading to data inconsistencies or debugging frustration.

Suggested fix:

# Module-level imports (top of tables.py)
import copy

from metadata.generated.schema.type.basic import Markdown

def update_column_description(
    cls, table_id: UuidLike, column_name: str, description: str
) -> Table:
    """Update the description for a specific column."""
    current = cls.retrieve(table_id, fields=["columns"])
    patched = copy.deepcopy(current)
    column_found = False
    for col in patched.columns or []:
        if str(getattr(col, "name", "")) == column_name:
            # Column.description is typed Markdown | None, so wrap the plain str.
            col.description = Markdown(description)
            column_found = True
            break
    if not column_found:
        # Fail loudly instead of sending an unchanged entity to cls.update().
        raise ValueError(f"Column '{column_name}' not found in table")
    return cls.update(patched)
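The same "raise when nothing matched" behavior can also be expressed with Python's for/else clause, which removes the need for a flag variable. The sketch below is self-contained and uses hypothetical stand-in dataclasses named Column and Table rather than the generated OpenMetadata models; with_column_description is likewise an illustrative name.

```python
import copy
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Column:  # stand-in for the generated Column model
    name: str
    description: Optional[str] = None


@dataclass
class Table:  # stand-in for the generated Table model
    columns: List[Column] = field(default_factory=list)


def with_column_description(table: Table, column_name: str, description: str) -> Table:
    """Return a patched copy of `table`, or raise if the column does not exist."""
    patched = copy.deepcopy(table)
    for col in patched.columns:
        if col.name == column_name:
            col.description = description
            break
    else:
        # The else branch runs only when the loop finished without hitting break.
        raise ValueError(f"Column '{column_name}' not found in table")
    return patched
```

A caller that passes a misspelled column name now gets a ValueError immediately instead of a silent no-op update.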


"recursive": str(recursive).lower(),
"hardDelete": str(hard_delete).lower(),
}
rest_client.delete(f"{endpoint}/name/{fqn}", data=params)

@github-actions
Contributor

github-actions bot commented Feb 3, 2026

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@github-actions
Contributor

github-actions bot commented Feb 3, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (10)

Package Vulnerability ID Severity Installed Version Fixed Version
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

@github-actions
Contributor

github-actions bot commented Feb 3, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

Vulnerabilities (4)

Package Vulnerability ID Severity Installed Version Fixed Version
libpam-modules CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-modules-bin CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-runtime CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam0g CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (20)

Package Vulnerability ID Severity Installed Version Fixed Version
Werkzeug CVE-2024-34069 🚨 HIGH 2.2.3 3.0.3
aiohttp CVE-2025-69223 🚨 HIGH 3.12.12 3.13.3
aiohttp CVE-2025-69223 🚨 HIGH 3.13.2 3.13.3
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6
azure-core CVE-2026-21226 🚨 HIGH 1.37.0 1.38.0
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
protobuf CVE-2026-0994 🚨 HIGH 4.25.8 6.33.5
pyasn1 CVE-2026-23490 🚨 HIGH 0.6.1 0.6.2
python-multipart CVE-2026-24486 🚨 HIGH 0.0.20 0.0.22
ray CVE-2025-62593 🔥 CRITICAL 2.47.1 2.52.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found


Labels

backend, safe to test (Add this label to run secure Github workflows on PRs)


1 participant