Skip to content

add nlq to OpenMetadataApplicationConfig#27988

Merged
pmbrull merged 6 commits intomainfrom
add-nlq-config-settings
May 9, 2026
Merged

add nlq to OpenMetadataApplicationConfig#27988
pmbrull merged 6 commits intomainfrom
add-nlq-config-settings

Conversation

@lautel
Copy link
Copy Markdown
Contributor

@lautel lautel commented May 8, 2026


Summary by Gitar

  • Configuration updates:
    • Added nlqHybridSearch to OpenMetadataApplicationConfig.
    • Renamed maxConcurrentEmbeddingRequests to maxConcurrentRequests in configurations and code to support both embedding and NLQ providers.
  • Schema enhancements:
    • Expanded elasticSearchConfiguration.json with new tuning parameters for bedrock and openai (timeout, max tokens, temperature).
    • Added filterExtractor and hybridSearch settings to support advanced NLQ and search pipeline configurations.

This will update automatically on new commits.

@lautel lautel self-assigned this May 8, 2026
@lautel lautel added the safe to test Add this label to run secure Github workflows on PRs label May 8, 2026
dimension,
endpoint,
isAzure,
new NaturalLanguageSearchConfiguration().getMaxConcurrentRequests());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Quality: Instantiating config object just to read a default value

The change replaces DEFAULT_MAX_CONCURRENT_REQUESTS (a clear constant with value 10) with new NaturalLanguageSearchConfiguration().getMaxConcurrentRequests(). This instantiates a full configuration POJO solely to retrieve a JSON-schema default value. This is less readable and less efficient than referencing the existing constant in EmbeddingClient. The parent class EmbeddingClient already has resolveMaxConcurrent() which falls back to DEFAULT_MAX_CONCURRENT_REQUESTS when the config value is null/non-positive, so the indirection through a fresh config object is unnecessary.

Suggested fix:

Replace with the existing constant:
    this(
        httpClient,
        apiKey,
        modelId,
        dimension,
        endpoint,
        isAzure,
        DEFAULT_MAX_CONCURRENT_REQUESTS);

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

@github-actions github-actions Bot requested a review from a team as a code owner May 8, 2026 13:53
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

🔴 Playwright Results — 1 failure(s), 15 flaky

✅ 4069 passed · ❌ 1 failed · 🟡 15 flaky · ⏭️ 86 skipped

Shard Passed Failed Flaky Skipped
✅ Shard 1 299 0 0 4
🟡 Shard 2 757 0 11 8
🔴 Shard 3 779 1 1 7
🟡 Shard 4 789 0 1 18
🟡 Shard 5 708 0 1 41
🟡 Shard 6 737 0 1 8

Genuine Failures (failed on all attempts)

Flow/ObservabilityAlerts.spec.ts › Alert operations for a user with and without permissions (shard 3)
�[31mTest timeout of 60000ms exceeded.�[39m
🟡 15 flaky test(s) (passed on retry)
  • Features/ActivityAPI.spec.ts › Activity event shows the actor who made the change (shard 2, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary Term (Nested) (shard 2, 2 retries)
  • Features/BulkImport.spec.ts › Keyboard Delete selection (shard 2, 1 retry)
  • Features/DataQuality/ColumnLevelTests.spec.ts › Column Values To Be Not In Set (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should display correct status badge color and icon (shard 2, 2 retries)
  • Features/KnowledgeCenter.spec.ts › Article mentions in description should working for Knowledge Center (shard 2, 1 retry)
  • Features/KnowledgeCenterList.spec.ts › Knowledge Center List - Test add article button (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/KnowledgeCenterTextEditor.spec.ts › Rich Text Editor - Text Formatting (shard 2, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Pages/DataInsightReportApplication.spec.ts › Uninstall application (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Delete Topic (shard 5, 1 retry)
  • Pages/Lineage/LineageFilters.spec.ts › Verify lineage schema filter selection (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

@gitar-bot
Copy link
Copy Markdown

gitar-bot Bot commented May 9, 2026

Code Review 👍 Approved with suggestions 1 resolved / 2 findings

Integrates NLQ configuration into the application core and updates client settings for Bedrock and OpenAI providers. Refactor the instantiation of NaturalLanguageSearchConfiguration to avoid object creation when retrieving default values.

💡 Quality: Instantiating config object just to read a default value

📄 openmetadata-service/src/main/java/org/openmetadata/service/search/vector/client/OpenAIEmbeddingClient.java:88

The change replaces DEFAULT_MAX_CONCURRENT_REQUESTS (a clear constant with value 10) with new NaturalLanguageSearchConfiguration().getMaxConcurrentRequests(). This instantiates a full configuration POJO solely to retrieve a JSON-schema default value. This is less readable and less efficient than referencing the existing constant in EmbeddingClient. The parent class EmbeddingClient already has resolveMaxConcurrent() which falls back to DEFAULT_MAX_CONCURRENT_REQUESTS when the config value is null/non-positive, so the indirection through a fresh config object is unnecessary.

Suggested fix
Replace with the existing constant:
    this(
        httpClient,
        apiKey,
        modelId,
        dimension,
        endpoint,
        isAzure,
        DEFAULT_MAX_CONCURRENT_REQUESTS);
✅ 1 resolved
Quality: Untyped JsonNode config loses validation and discoverability

📄 openmetadata-service/src/main/java/org/openmetadata/service/OpenMetadataApplicationConfig.java:83-84
The nlqHybridSearch field uses raw JsonNode, which bypasses Dropwizard's @Valid/@NotNull constraint propagation and provides no compile-time contract for consumers. If the structure is known (even partially), a dedicated configuration record/class would enable validation at startup, self-document expected keys, and align with the typed pattern used by every other field in this class (e.g., ElasticSearchConfiguration, AuthenticationConfiguration).

🤖 Prompt for agents
Code Review: Integrates NLQ configuration into the application core and updates client settings for Bedrock and OpenAI providers. Refactor the instantiation of NaturalLanguageSearchConfiguration to avoid object creation when retrieving default values.

1. 💡 Quality: Instantiating config object just to read a default value
   Files: openmetadata-service/src/main/java/org/openmetadata/service/search/vector/client/OpenAIEmbeddingClient.java:88

   The change replaces `DEFAULT_MAX_CONCURRENT_REQUESTS` (a clear constant with value 10) with `new NaturalLanguageSearchConfiguration().getMaxConcurrentRequests()`. This instantiates a full configuration POJO solely to retrieve a JSON-schema default value. This is less readable and less efficient than referencing the existing constant in `EmbeddingClient`. The parent class `EmbeddingClient` already has `resolveMaxConcurrent()` which falls back to `DEFAULT_MAX_CONCURRENT_REQUESTS` when the config value is null/non-positive, so the indirection through a fresh config object is unnecessary.

   Suggested fix:
   Replace with the existing constant:
       this(
           httpClient,
           apiKey,
           modelId,
           dimension,
           endpoint,
           isAzure,
           DEFAULT_MAX_CONCURRENT_REQUESTS);

Options

Display: compact → Showing less information.

Comment with these commands to change:

Compact
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 9, 2026

Jest test Coverage

UI tests summary

Lines Statements Branches Functions
Coverage: 62%
62.32% (64259/103106) 42.92% (34848/81175) 45.72% (10256/22428)

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 9, 2026

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 9, 2026

@pmbrull pmbrull merged commit 882ef3f into main May 9, 2026
63 of 68 checks passed
@pmbrull pmbrull deleted the add-nlq-config-settings branch May 9, 2026 16:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Backward-Incompatible-Change safe to test Add this label to run secure Github workflows on PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants