ISSUE #3028 - Eliminate Double Iteration on Entity Filtering and Push Regex Filters to API #26709
TeddyCr wants to merge 8 commits into open-metadata:main
Conversation
|
| schema_filter = _build_regex_from_filter(self.source_config.schemaFilterPattern) | ||
| table_filter = _build_regex_from_filter(self.source_config.tableFilterPattern) |
💡 Performance: Redundant _build_regex_from_filter calls in hot path
_build_regex_from_filter is called multiple times with the same filter patterns: once in _build_table_params, once in _has_conflicting_filter_modes, and once in _filter_deferred_excludes (which runs per-table). These could be computed once and cached on the instance during __init__ to avoid rebuilding RegexFilter objects on every table iteration.
Suggested fix:
# In DatabaseFetcherStrategy.__init__, compute once:
def __init__(self, ...):
    super().__init__(...)
    self.source_config = cast(EntityFilterConfigInterface, self.source_config)
    self._schema_filter = _build_regex_from_filter(self.source_config.schemaFilterPattern)
    self._table_filter = _build_regex_from_filter(self.source_config.tableFilterPattern)
    self._conflicting_modes = (
        self._schema_filter is not None
        and self._table_filter is not None
        and self._schema_filter.mode != self._table_filter.mode
    )
# Then reference self._schema_filter, self._table_filter, self._conflicting_modes
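A lazy variant of the same idea can be sketched with `functools.cached_property`, which builds the filter on first access and reuses it afterwards. The class name `CachedFilterStrategy`, its constructor argument, and the `builds` counter are hypothetical and exist only to make the caching visible; this is not the PR's code.

```python
import re
from functools import cached_property
from typing import List, Optional


class CachedFilterStrategy:
    """Hypothetical strategy that builds its schema regex at most once."""

    def __init__(self, schema_includes: List[str]):
        self._schema_includes = schema_includes
        self.builds = 0  # counts how often the regex is actually built

    @cached_property
    def schema_filter(self) -> Optional[str]:
        # Runs on first access only; the result is stored on the instance.
        self.builds += 1
        if not self._schema_includes:
            return None
        return "|".join(f"({p})" for p in self._schema_includes)

    def keep(self, schema_name: str) -> bool:
        # Called once per entity; reuses the cached regex.
        flt = self.schema_filter
        return flt is None or re.match(flt, schema_name, re.IGNORECASE) is not None
```

Calling `keep` for many schema names leaves `builds` at 1, which is the effect the review comment is after.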
Code Review 👍 Approved with suggestions (0 resolved / 1 findings)
Optimizes entity filtering by pushing regex filters to the API and eliminating double iteration in the ingestion pipeline. Consider reducing redundant _build_regex_from_filter calls in the hot path by caching results across _build_table_params, _has_conflicting_filter_modes, and _filter_deferred_excludes.
💡 Performance: Redundant _build_regex_from_filter calls in hot path
📄 ingestion/src/metadata/profiler/source/fetcher/fetcher_strategy.py:208-209
📄 ingestion/src/metadata/profiler/source/fetcher/fetcher_strategy.py:233-235
📄 ingestion/src/metadata/profiler/source/fetcher/fetcher_strategy.py:245-246
Pull request overview
This PR optimizes profiler ingestion entity listing by pushing database/schema/table filter patterns down to the backend list APIs as regex query parameters, reducing client-side filtering and network transfer.
Changes:
- Added `databaseRegex`, `databaseSchemaRegex`, and `tableRegex` query params (plus `regexMode` and `regexFilterByFqn`) to Database/DatabaseSchema/Table list endpoints.
- Implemented backend regex filtering in `ListFilter` for MySQL and PostgreSQL.
- Updated ingestion fetcher strategy to translate `FilterPattern` config into API query params and added integration/unit tests for the new behavior.
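The translation from a filter pattern to query params might look roughly like the sketch below. The `FilterPattern` dataclass is a stand-in for the ingestion config model, and `combine_patterns` and `build_regex_params` are hypothetical helpers, not the PR's `_combine_patterns` or `_build_database_params`. The sketch assumes includes win when both lists are set; how the PR resolves that case is not shown in this conversation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FilterPattern:
    """Stand-in for the ingestion FilterPattern config (assumed shape)."""
    includes: List[str] = field(default_factory=list)
    excludes: List[str] = field(default_factory=list)


def combine_patterns(patterns: List[str]) -> str:
    # OR the patterns together; each one is grouped so that an
    # alternation inside a single pattern stays scoped to it.
    return "|".join(f"({p})" for p in patterns)


def build_regex_params(pattern: Optional[FilterPattern], param_name: str) -> Dict[str, str]:
    """Translate a filter pattern into list-endpoint query params."""
    if pattern is None:
        return {}
    if pattern.includes:
        return {param_name: combine_patterns(pattern.includes), "regexMode": "include"}
    if pattern.excludes:
        return {param_name: combine_patterns(pattern.excludes), "regexMode": "exclude"}
    return {}
```

The resulting dict would be passed as query params to the list call, for example with `param_name="databaseRegex"` for the Database endpoint.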
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| openmetadata-spec/src/main/resources/json/schema/type/regexMode.json | Adds RegexMode schema to support include/exclude regex behavior. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/DatabaseResource.java | Adds databaseRegex, regexMode, regexFilterByFqn to DB list endpoint. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/DatabaseSchemaResource.java | Adds databaseSchemaRegex, regexMode, regexFilterByFqn to schema list endpoint. |
| openmetadata-service/src/main/java/org/openmetadata/service/resources/databases/TableResource.java | Adds databaseSchemaRegex, tableRegex, regexMode, regexFilterByFqn to table list endpoint. |
| openmetadata-service/src/main/java/org/openmetadata/service/jdbi3/ListFilter.java | Adds JSON-field regex condition generation for MySQL/Postgres and wires into list filtering. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/DatabaseResourceIT.java | Adds integration tests for databaseRegex include/exclude behavior. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/DatabaseSchemaResourceIT.java | Adds integration tests for databaseSchemaRegex include/exclude behavior. |
| openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/TableResourceIT.java | Adds integration tests for table listing with schema/table regex and exclude mode. |
| ingestion/src/metadata/profiler/source/fetcher/fetcher_strategy.py | Translates ingestion filter patterns into server-side regex params; defers conflicting excludes to client-side. |
| ingestion/tests/unit/observability/profiler/test_workflow.py | Adds unit tests for regex param building helpers in profiler workflow. |
| ingestion/tests/unit/observability/profiler/test_entity_fetcher.py | Adds unit tests validating end-to-end param passing and deferred client-side filters. |
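The "defers conflicting excludes to client-side" behavior noted for fetcher_strategy.py can be illustrated with a small sketch. Because a request carries a single `regexMode`, an include schema filter and an exclude table filter cannot both be pushed down in one call. The `RegexFilter` tuple, `split_filters`, and `apply_deferred` below are hypothetical names and shapes, and the dict-based table records are an assumption made for the example.

```python
import re
from typing import Dict, Iterable, Iterator, List, NamedTuple, Optional, Tuple


class RegexFilter(NamedTuple):
    """Assumed shape: one combined regex plus its mode."""
    regex: str
    mode: str  # "include" or "exclude"


Deferred = List[Tuple[str, RegexFilter]]


def has_conflicting_modes(schema: Optional[RegexFilter], table: Optional[RegexFilter]) -> bool:
    return schema is not None and table is not None and schema.mode != table.mode


def split_filters(schema: Optional[RegexFilter], table: Optional[RegexFilter]) -> Tuple[Dict[str, str], Deferred]:
    """Return (params to send to the API, excludes to apply client-side)."""
    params: Dict[str, str] = {}
    deferred: Deferred = []
    conflict = has_conflicting_modes(schema, table)
    for name, flt in (("databaseSchemaRegex", schema), ("tableRegex", table)):
        if flt is None:
            continue
        if conflict and flt.mode == "exclude":
            deferred.append((name, flt))  # cannot share the request's regexMode
            continue
        params[name] = flt.regex
        params["regexMode"] = flt.mode
    return params, deferred


def apply_deferred(tables: Iterable[dict], deferred: Deferred) -> Iterator[dict]:
    """Drop tables matched by any deferred exclude filter."""
    for table in tables:
        excluded = any(
            re.match(flt.regex, table["schema"] if name == "databaseSchemaRegex" else table["name"], re.IGNORECASE)
            for name, flt in deferred
        )
        if not excluded:
            yield table
```

When the modes agree, `deferred` stays empty and everything is filtered server-side; only the conflicting exclude costs a client-side pass.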
| private String getFqnRegexCondition(String tableName, String regex, String paramName) { | ||
| String fieldPath = queryParams.get(paramName + "RegexField"); | ||
| if (nullOrEmpty(fieldPath)) { | ||
| fieldPath = "name"; | ||
| } | ||
| if (Boolean.parseBoolean(queryParams.get("regexFilterByFqn"))) { | ||
| int lastDot = fieldPath.lastIndexOf(".name"); | ||
| if (lastDot == -1) { | ||
| fieldPath = "fullyQualifiedName"; | ||
| } else { | ||
| fieldPath = fieldPath.substring(0, lastDot) + ".fullyQualifiedName"; | ||
| } | ||
| } | ||
| boolean exclude = RegexMode.EXCLUDE.value().equalsIgnoreCase(queryParams.get("regexMode")); | ||
| queryParams.put(paramName + "Regex", regex); | ||
| if (Boolean.TRUE.equals(DatasourceConfig.getInstance().isMySQL())) { | ||
| String expr = | ||
| tableName == null | ||
| ? String.format("JSON_UNQUOTE(JSON_EXTRACT(json, '$.%s'))", fieldPath) | ||
| : String.format("JSON_UNQUOTE(JSON_EXTRACT(%s.json, '$.%s'))", tableName, fieldPath); | ||
| String operator = exclude ? "NOT REGEXP" : "REGEXP"; | ||
| return String.format("%s %s :%s", expr, operator, paramName + "Regex"); | ||
| } else { | ||
| String pgPath = "{" + fieldPath.replace(".", ",") + "}"; | ||
| String expr = | ||
| tableName == null | ||
| ? String.format("json #>> '%s'", pgPath) | ||
| : String.format("%s.json #>> '%s'", tableName, pgPath); | ||
| String operator = exclude ? "!~" : "~"; | ||
| return String.format("%s %s :%s", expr, operator, paramName + "Regex"); | ||
| } |
PostgreSQL regex matching uses ~/!~, which is case-sensitive and also behaves like a search (matches anywhere) rather than Python’s re.match semantics used in ingestion filters (anchored at the start and IGNORECASE). If this is intended to preserve ingestion filter behavior when pushing patterns to the API, consider switching to case-insensitive operators (~*/!~*) and anchoring the pattern (e.g., prefixing with ^ when not already anchored) so server-side results match client-side filtering.
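The mismatch described in this comment can be reproduced in plain Python. In the sketch below, `re.search` without flags stands in for PostgreSQL's `~` (an approximation, since the two regex dialects are not identical), and all function names are hypothetical.

```python
import re


def ingestion_style_match(pattern: str, name: str) -> bool:
    # Semantics the comment attributes to ingestion filters:
    # anchored at the start and case-insensitive.
    return re.match(pattern, name, re.IGNORECASE) is not None


def search_style_match(pattern: str, name: str) -> bool:
    # Rough stand-in for PostgreSQL `~`: matches anywhere, case-sensitive.
    return re.search(pattern, name) is not None


def anchor_start(pattern: str) -> str:
    # Prefix ^ unless already anchored; the group keeps `a|b`
    # from turning into `^a` OR `b`.
    if pattern.startswith("^"):
        return pattern
    return "^(" + pattern + ")"


def pushed_down_match(pattern: str, name: str) -> bool:
    # Search semantics made equivalent: anchored and case-insensitive.
    return re.search(anchor_start(pattern), name, re.IGNORECASE) is not None
```

With the pattern `sales`, `ingestion_style_match` accepts `SALES_2024` and rejects `raw_sales`, while `search_style_match` does the opposite; `pushed_down_match` agrees with the ingestion-style result in both cases. A pattern that already starts with `^` but contains a top-level alternation (such as `^a|b`) would still need extra care.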
| public String getDatabaseCondition(String tableName) { | ||
| String database = queryParams.get("database"); | ||
| return database == null ? "" : getFqnPrefixCondition(tableName, database, "database"); | ||
| String databaseRegex = queryParams.get("databaseRegex"); | ||
| if (nullOrEmpty(database) && nullOrEmpty(databaseRegex)) { | ||
| return ""; | ||
| } | ||
| String hashCondition = "True"; | ||
| String regexCondition = "True"; | ||
| if (!nullOrEmpty(database)) { | ||
| hashCondition = getFqnPrefixCondition(tableName, database, "database"); | ||
| } | ||
| if (!nullOrEmpty(databaseRegex)) { | ||
| regexCondition = getFqnRegexCondition(tableName, databaseRegex, "database"); | ||
| } | ||
| return String.format("(%s AND %s)", hashCondition, regexCondition); | ||
| } |
hashCondition/regexCondition use the literal string "True". Elsewhere this class uses "TRUE" (e.g., WHERE TRUE). For consistency and to avoid any dialect quirks, prefer "TRUE" here as well.
| "Filter schemas by regex pattern. For better performance use in combination with database query filter", | ||
| schema = @Schema(type = "string", example = "snowflakeWestCoast.financeDB.*")) |
The databaseSchemaRegex parameter’s example looks like a fully-qualified pattern (snowflakeWestCoast.financeDB.*), but by default this filter is applied to the schema name field unless regexFilterByFqn=true is also set. Consider adjusting the example and/or description to reflect the default behavior (name-only) and when FQN matching applies.
| "Filter schemas by regex pattern. For better performance use in combination with database query filter", | |
| schema = @Schema(type = "string", example = "snowflakeWestCoast.financeDB.*")) | |
| "Filter schemas by regex pattern applied to the schema `name` by default. To apply the regex to the fullyQualifiedName instead, set `regexFilterByFqn=true`. For better performance use in combination with the `database` query filter.", | |
| schema = @Schema(type = "string", example = "finance_.*")) |
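To see why the example pattern matters, here is a small sketch of name-only versus FQN matching. The entity values and the `matches` helper are made up for illustration, and Python's `re.match` with `IGNORECASE` is used as a stand-in for the filter semantics rather than the backend's SQL.

```python
import re

# Hypothetical schema entity; only the two fields relevant to the filter.
schema_entity = {
    "name": "finance_core",
    "fullyQualifiedName": "snowflakeWestCoast.financeDB.finance_core",
}


def matches(pattern: str, entity: dict, by_fqn: bool = False) -> bool:
    """Apply the pattern to `name` by default, or to the FQN when by_fqn is set."""
    target = entity["fullyQualifiedName"] if by_fqn else entity["name"]
    return re.match(pattern, target, re.IGNORECASE) is not None


fqn_pattern = "snowflakeWestCoast.financeDB.*"
name_pattern = "finance_.*"
```

Under these assumptions, `fqn_pattern` finds nothing when applied to the name and only matches with `by_fqn=True`, whereas `name_pattern` matches the name directly, which is why a name-style example fits the default behavior better.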
| "Filter database by regex pattern. For better performance use in combination with service query filter", | ||
| schema = @Schema(type = "string", example = "snowflakeWestCoast.financeDB.*")) |
The databaseRegex parameter’s example looks like a fully-qualified pattern (snowflakeWestCoast.financeDB.*), but by default this filter is applied to the database name field unless regexFilterByFqn=true is also set. Consider adjusting the example and/or description to reflect the default behavior (name-only) and when FQN matching applies.
| "Filter database by regex pattern. For better performance use in combination with service query filter", | |
| schema = @Schema(type = "string", example = "snowflakeWestCoast.financeDB.*")) | |
| "Filter databases by regex pattern. By default, the pattern is applied to the database name field " | |
| + "(for example, 'financeDB.*'). To filter by fullyQualifiedName instead, set 'regexFilterByFqn=true' " | |
| + "and use an FQN-style pattern (for example, 'snowflakeWestCoast.financeDB.*'). For better performance, " | |
| + "use in combination with the 'service' query filter.", | |
| schema = @Schema(type = "string", example = "financeDB.*")) |
| "Filter tables by database schema regex pattern. For better performance use in combination with database query filter", | ||
| schema = @Schema(type = "string", example = "snowflakeWestCoast.financeDB.*")) | ||
| @QueryParam("databaseSchemaRegex") | ||
| String databaseSchemaParamRegex, | ||
| @Parameter( | ||
| description = | ||
| "Filter tables by table regex pattern. For better performance use in combination with database and/or databaseSchema query filter", | ||
| schema = @Schema(type = "string", example = "snowflakeWestCoast.financeDB.*")) |
Both databaseSchemaRegex and tableRegex examples look like fully-qualified patterns (snowflakeWestCoast.financeDB.*), but by default databaseSchemaRegex matches databaseSchema.name and tableRegex matches name unless regexFilterByFqn=true is also set. Consider adjusting examples/descriptions so they match the default behavior and clarify how to use FQN matching.
| "Filter tables by database schema regex pattern. For better performance use in combination with database query filter", | |
| schema = @Schema(type = "string", example = "snowflakeWestCoast.financeDB.*")) | |
| @QueryParam("databaseSchemaRegex") | |
| String databaseSchemaParamRegex, | |
| @Parameter( | |
| description = | |
| "Filter tables by table regex pattern. For better performance use in combination with database and/or databaseSchema query filter", | |
| schema = @Schema(type = "string", example = "snowflakeWestCoast.financeDB.*")) | |
| "Filter tables by database schema regex pattern applied to databaseSchema.name by default. " | |
| + "To apply the regex to the fully qualified name, set regexFilterByFqn=true. " | |
| + "For better performance, use this in combination with the database query filter.", | |
| schema = @Schema(type = "string", example = "finance_schema_.*")) | |
| @QueryParam("databaseSchemaRegex") | |
| String databaseSchemaParamRegex, | |
| @Parameter( | |
| description = | |
| "Filter tables by table regex pattern applied to the table name by default. " | |
| + "To apply the regex to the table fully qualified name, set regexFilterByFqn=true. " | |
| + "For better performance, use this in combination with the database and/or databaseSchema query filters.", | |
| schema = @Schema(type = "string", example = "orders_.*")) |
| filter.addQueryParam("tableParamRegex", tableParamRegex); | ||
| filter.addQueryParam("tableParamRegexField", "name"); |
The table regex is received as query param tableRegex, but is stored into the ListFilter under the key tableParamRegex/tableParamRegexField. This is inconsistent with databaseRegex/databaseSchemaRegex naming and makes tracing request params harder. Consider renaming the internal keys to tableRegex/tableRegexField and updating ListFilter.getTableNameRegexCondition() accordingly.
| filter.addQueryParam("tableParamRegex", tableParamRegex); | |
| filter.addQueryParam("tableParamRegexField", "name"); | |
| filter.addQueryParam("tableRegex", tableParamRegex); | |
| filter.addQueryParam("tableRegexField", "name"); |
| raise ValueError( | ||
| "databaseFilterPattern returned 0 result. At least 1 database must be returned by the filter pattern." | ||
| f"\n\t- includes: {self.source_config.databaseFilterPattern.includes if self.source_config.databaseFilterPattern else None}" # pylint: disable=line-too-long | ||
| f"\n\t- excludes: {self.source_config.databaseFilterPattern.excludes if self.source_config.databaseFilterPattern else None}" # pylint: disable=line-too-long |
When no databases are returned, this always raises an error saying "databaseFilterPattern returned 0 result" even when databaseFilterPattern is not configured (i.e., the service legitimately has no databases or the service filter is wrong). Consider emitting a different message when databaseFilterPattern is null vs when a configured pattern filtered everything out, so the error is actionable.
| raise ValueError( | |
| "databaseFilterPattern returned 0 result. At least 1 database must be returned by the filter pattern." | |
| f"\n\t- includes: {self.source_config.databaseFilterPattern.includes if self.source_config.databaseFilterPattern else None}" # pylint: disable=line-too-long | |
| f"\n\t- excludes: {self.source_config.databaseFilterPattern.excludes if self.source_config.databaseFilterPattern else None}" # pylint: disable=line-too-long | |
| db_filter_pattern = self.source_config.databaseFilterPattern | |
| if db_filter_pattern: | |
| raise ValueError( | |
| "databaseFilterPattern returned 0 result. At least 1 database must be returned by the filter pattern." | |
| f"\n\t- includes: {db_filter_pattern.includes}" # pylint: disable=line-too-long | |
| f"\n\t- excludes: {db_filter_pattern.excludes}" # pylint: disable=line-too-long | |
| ) | |
| raise ValueError( | |
| "No databases were returned for the configured service. " | |
| "Either the service has no databases, or the service configuration is incorrect." |
🛡️ TRIVY SCAN RESULT 🛡️ Target:
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.12.7 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.13.4 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.15.2 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
| com.google.code.gson:gson | CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
| com.nimbusds:nimbus-jose-jwt | CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
| com.squareup.okhttp3:okhttp | CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
| commons-beanutils:commons-beanutils | CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
| commons-io:commons-io | CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
| dnsjava:dnsjava | CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
| io.airlift:aircompressor | CVE-2025-67721 | 🚨 HIGH | 0.27 | 2.0.3 |
| io.netty:netty-codec-http2 | CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
| io.netty:netty-codec-http2 | GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
| io.netty:netty-handler | CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
| net.minidev:json-smart | CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
| net.minidev:json-smart | CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
| org.apache.avro:avro | CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
| org.apache.avro:avro | CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
| org.apache.derby:derby | CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
| org.apache.ivy:ivy | CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
| org.apache.mesos:mesos | CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
| org.apache.spark:spark-core_2.12 | CVE-2025-54920 | 🚨 HIGH | 3.5.6 | 3.5.7 |
| org.apache.thrift:libthrift | CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
| org.apache.thrift:libthrift | CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
| org.apache.zookeeper:zookeeper | CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
| org.eclipse.jetty:jetty-server | CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
| org.lz4:lz4-java | CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: Node.js
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: Python
Vulnerabilities (13)
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| apache-airflow | CVE-2025-68438 | 🚨 HIGH | 3.1.5 | 3.1.6 |
| apache-airflow | CVE-2025-68675 | 🚨 HIGH | 3.1.5 | 3.1.6, 2.11.1 |
| apache-airflow | CVE-2026-26929 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-28779 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-30911 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| cryptography | CVE-2026-26007 | 🚨 HIGH | 42.0.8 | 46.0.5 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 6.0.1 | 6.1.0 |
| pyOpenSSL | CVE-2026-27459 | 🚨 HIGH | 24.1.0 | 26.0.0 |
| starlette | CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
| urllib3 | CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2026-21441 | 🚨 HIGH | 1.26.20 | 2.6.3 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: /etc/ssl/private/ssl-cert-snakeoil.key
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/extended_sample_data.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/lineage.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_data.json
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_data.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_data_aut.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_usage.json
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_usage.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /ingestion/pipelines/sample_usage_aut.yaml
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️ Target:
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| libpam-modules | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam-modules-bin | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam-runtime | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam0g | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: Java
Vulnerabilities (39)
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.12.7 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.13.4 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.15.2 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.16.1 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
| com.google.code.gson:gson | CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
| com.nimbusds:nimbus-jose-jwt | CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
| com.squareup.okhttp3:okhttp | CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
| commons-beanutils:commons-beanutils | CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
| commons-io:commons-io | CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
| dnsjava:dnsjava | CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
| io.airlift:aircompressor | CVE-2025-67721 | 🚨 HIGH | 0.27 | 2.0.3 |
| io.netty:netty-codec-http2 | CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
| io.netty:netty-codec-http2 | GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
| io.netty:netty-handler | CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
| net.minidev:json-smart | CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
| net.minidev:json-smart | CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
| org.apache.avro:avro | CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
| org.apache.avro:avro | CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
| org.apache.derby:derby | CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
| org.apache.ivy:ivy | CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
| org.apache.mesos:mesos | CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
| org.apache.spark:spark-core_2.12 | CVE-2025-54920 | 🚨 HIGH | 3.5.6 | 3.5.7 |
| org.apache.thrift:libthrift | CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
| org.apache.thrift:libthrift | CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
| org.apache.zookeeper:zookeeper | CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
| org.eclipse.jetty:jetty-server | CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
| org.lz4:lz4-java | CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: Node.js
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: Python
Vulnerabilities (33)
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| Authlib | CVE-2026-27962 | 🔥 CRITICAL | 1.6.6 | 1.6.9 |
| Authlib | CVE-2026-28490 | 🚨 HIGH | 1.6.6 | 1.6.9 |
| Authlib | CVE-2026-28498 | 🚨 HIGH | 1.6.6 | 1.6.9 |
| Authlib | CVE-2026-28802 | 🚨 HIGH | 1.6.6 | 1.6.7 |
| PyJWT | CVE-2026-32597 | 🚨 HIGH | 2.10.1 | 2.12.0 |
| Werkzeug | CVE-2024-34069 | 🚨 HIGH | 2.2.3 | 3.0.3 |
| aiohttp | CVE-2025-69223 | 🚨 HIGH | 3.12.12 | 3.13.3 |
| aiohttp | CVE-2025-69223 | 🚨 HIGH | 3.13.2 | 3.13.3 |
| apache-airflow | CVE-2025-68438 | 🚨 HIGH | 3.1.5 | 3.1.6 |
| apache-airflow | CVE-2025-68675 | 🚨 HIGH | 3.1.5 | 3.1.6, 2.11.1 |
| apache-airflow | CVE-2026-26929 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-28779 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-30911 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow-providers-http | CVE-2025-69219 | 🚨 HIGH | 5.6.0 | 6.0.0 |
| azure-core | CVE-2026-21226 | 🚨 HIGH | 1.37.0 | 1.38.0 |
| cryptography | CVE-2026-26007 | 🚨 HIGH | 42.0.8 | 46.0.5 |
| google-cloud-aiplatform | CVE-2026-2472 | 🚨 HIGH | 1.130.0 | 1.131.0 |
| google-cloud-aiplatform | CVE-2026-2473 | 🚨 HIGH | 1.130.0 | 1.133.0 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 5.3.0 | 6.1.0 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 6.0.1 | 6.1.0 |
| protobuf | CVE-2026-0994 | 🚨 HIGH | 4.25.8 | 6.33.5, 5.29.6 |
| pyOpenSSL | CVE-2026-27459 | 🚨 HIGH | 24.1.0 | 26.0.0 |
| pyasn1 | CVE-2026-23490 | 🚨 HIGH | 0.6.1 | 0.6.2 |
| pyasn1 | CVE-2026-30922 | 🚨 HIGH | 0.6.1 | 0.6.3 |
| python-multipart | CVE-2026-24486 | 🚨 HIGH | 0.0.20 | 0.0.22 |
| ray | CVE-2025-62593 | 🔥 CRITICAL | 2.47.1 | 2.52.0 |
| starlette | CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
| tornado | CVE-2026-31958 | 🚨 HIGH | 6.5.3 | 6.5.5 |
| urllib3 | CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2026-21441 | 🚨 HIGH | 1.26.20 | 2.6.3 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: usr/bin/docker
Vulnerabilities (4)
| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
|---|---|---|---|---|
| stdlib | CVE-2025-68121 | 🔥 CRITICAL | v1.25.5 | 1.24.13, 1.25.7, 1.26.0-rc.3 |
| stdlib | CVE-2025-61726 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
| stdlib | CVE-2025-61728 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
| stdlib | CVE-2026-25679 | 🚨 HIGH | v1.25.5 | 1.25.8, 1.26.1 |
🛡️ TRIVY SCAN RESULT 🛡️
Target: /etc/ssl/private/ssl-cert-snakeoil.key
No Vulnerabilities Found
🛡️ TRIVY SCAN RESULT 🛡️
Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO
No Vulnerabilities Found
🟡 Playwright Results: all passed (16 flaky)
✅ 3417 passed · ❌ 0 failed · 🟡 16 flaky · ⏭️ 183 skipped
🟡 16 flaky test(s) (passed on retry)
How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip # view trace



Describe your changes:
Summary
Changes by file
Backend (Java)
- openmetadata-spec/.../type/regexMode.json (new): `RegexMode` with values `include` and `exclude`, used across all regex-filtered list endpoints.
- openmetadata-service/.../jdbi3/ListFilter.java: `getDatabaseCondition` and `getDatabaseSchemaCondition` now support regex params alongside existing exact filters. Added `getTableNameRegexCondition` for table name regex. Added core `getFqnRegexCondition` method that builds database-specific regex SQL: `JSON_UNQUOTE(JSON_EXTRACT(...)) REGEXP` for MySQL, `json #>> '...' ~` for PostgreSQL. Supports include/exclude via `RegexMode` and FQN-based matching via `regexFilterByFqn`.
- openmetadata-service/.../databases/DatabaseResource.java: Added `databaseRegex`, `regexFilterByFqn`, and `regexMode` query parameters to the list endpoint. Maps `databaseRegex` to `ListFilter` with field hint `databaseRegexField=name`.
- openmetadata-service/.../databases/DatabaseSchemaResource.java: Added `databaseSchemaRegex`, `regexFilterByFqn`, and `regexMode` query parameters to the list endpoint. Maps `databaseSchemaRegex` to `ListFilter` with field hint `databaseSchemaRegexField=name`.
- openmetadata-service/.../databases/TableResource.java: Added `databaseSchemaRegex`, `tableRegex`, `regexFilterByFqn`, and `regexMode` query parameters to the list endpoint. Maps `databaseSchemaRegex` with field hint `databaseSchema.name` and `tableRegex` as `tableParamRegex` with field hint `name`.
Ingestion (Python)
- ingestion/.../fetcher/fetcher_strategy.py: Added `RegexFilter` model, `_combine_patterns`, `_build_regex_from_filter`, `_build_database_params`, `_build_table_params`, `_has_conflicting_filter_modes`, `_filter_deferred_excludes`. Refactored `_get_database_entities` and `_get_table_entities` from eager list + client-side filtering to generators that delegate filtering to the backend via regex params. Removed `_filter_databases`, `_filter_schemas`, `_filter_tables`, `_filter_entities`, `_filter_column_metrics_computation`, which are replaced by server-side filtering.
Tests
- ingestion/tests/.../test_entity_fetcher.py: Added `TestGetDatabaseEntities` (6 tests): include/exclude regex param building, multiple includes combined with OR, FQN filtering, no-filter passthrough, zero-result error. Added `TestGetTableEntities` (11 tests): schema+table regex, view filtering, classification include/exclude, combined filters, FQN filtering, conflicting modes. Added `TestFetch` (3 tests): end-to-end pipeline, client-side filters, per-database error handling.
- ingestion/tests/.../test_workflow.py: Replaced `test_filter_entities` with `test_build_regex_from_filter`, `test_build_database_params`, `test_build_table_params`. Refactored `test_filter_classifications` to test `filter_classifications` directly.
- .../it/tests/DatabaseResourceIT.java
- .../it/tests/DatabaseSchemaResourceIT.java
- .../it/tests/TableResourceIT.java
Type of change:
Checklist:
Fixes <issue-number>: <short explanation>
Improvement