Skip to content

fix(QTDI-2420): Change discover schema action order for input in TaCoKitGuessSchema#1164

Open
yyin-talend wants to merge 7 commits intomasterfrom
fix/QTDI-2420-DiscoverSchemaExtended
Open

fix(QTDI-2420): Change discover schema action order for input in TaCoKitGuessSchema#1164
yyin-talend wants to merge 7 commits intomasterfrom
fix/QTDI-2420-DiscoverSchemaExtended

Conversation

@yyin-talend
Copy link
Contributor

@yyin-talend yyin-talend commented Feb 4, 2026

@yyin-talend yyin-talend requested a review from undx February 4, 2026 06:54
@yyin-talend yyin-talend changed the title Fix(QTDI-2420):Change discover schema action order for input in TaCoKitGuessSchema fix(QTDI-2420):Change discover schema action order for input in TaCoKitGuessSchema Feb 4, 2026
@sonar-eks
Copy link

sonar-eks bot commented Feb 4, 2026

undx
undx previously approved these changes Feb 4, 2026
Copy link
Contributor

@ypiel-talend ypiel-talend left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ypiel-talend ypiel-talend requested a review from undx March 9, 2026 09:18
Copy link
Contributor

@ozhelezniak-talend ozhelezniak-talend left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, but it really good to just rewrite this part completely...

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates schema discovery selection for DI input components to prioritize @DiscoverSchemaExtended before @DiscoverSchema, aligning runtime behavior with the QTDI-2420 expectation and updating the related Studio documentation.

Changes:

  • Reordered inferred action lookup to try schema_extended (DiscoverSchemaExtended) before schema (DiscoverSchema) when action is empty.
  • When action is explicitly provided, allow both schema_extended and schema types and prefer schema_extended.
  • Updated documentation to reflect the new action selection order for inputs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
documentation/src/main/antora/modules/ROOT/pages/studio-schema.adoc Adjusts documented input action-selection order to prefer @DiscoverSchemaExtended before @DiscoverSchema.
component-studio/component-runtime-di/src/main/java/org/talend/sdk/component/runtime/di/schema/TaCoKitGuessSchema.java Changes runtime action-selection order and selection criteria to prioritize and support both schema action types.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

@ozhelezniak-talend
Copy link
Contributor

Could you check Copilot comments, some of them looks valid

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
undx and others added 2 commits March 16, 2026 17:28
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copy link
Contributor

@ozhelezniak-talend ozhelezniak-talend left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ozhelezniak-talend ozhelezniak-talend changed the title fix(QTDI-2420):Change discover schema action order for input in TaCoKitGuessSchema fix(QTDI-2420): Change discover schema action order for input in TaCoKitGuessSchema Mar 20, 2026
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates schema discovery selection to prefer @DiscoverSchemaExtended over @DiscoverSchema for input schema guessing in TaCoKitGuessSchema, and aligns (part of) the documentation with that updated selection order.

Changes:

  • Reordered input action lookup to try schema_extended before schema when inferring the action name from the dataset.
  • When an action name is explicitly provided, allow both schema and schema_extended and prefer schema_extended when both exist.
  • Updated the Studio schema documentation’s “For inputs” selection order accordingly.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
documentation/src/main/antora/modules/ROOT/pages/studio-schema.adoc Updates documented action selection order for input schema discovery.
component-studio/component-runtime-di/src/main/java/org/talend/sdk/component/runtime/di/schema/TaCoKitGuessSchema.java Adjusts runtime action selection to prefer schema_extended and support both action types for the same name.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 446 to +461
actionRef = services
.stream()
.flatMap(s -> s.getActions().stream())
.filter(a -> a.getFamily().equals(family) && a.getAction().equals(action)
&& a.getType().equals(SCHEMA_TYPE))
&& (a.getType().equals(SCHEMA_EXTENDED_TYPE) || a.getType().equals(SCHEMA_TYPE)))
// When both DiscoverSchemaExtended and DiscoverSchema exist for the same action name,
// prefer the extended schema action by ordering it first.
.sorted((action1, action2) -> {
boolean action1IsExtended = action1.getType().equals(SCHEMA_EXTENDED_TYPE);
boolean action2IsExtended = action2.getType().equals(SCHEMA_EXTENDED_TYPE);
return Boolean.compare(!action1IsExtended, !action2IsExtended);
})
.findFirst()
.orElseThrow(() -> new IllegalArgumentException(
"No action " + family + "#" + SCHEMA_TYPE + "#" + action));
"No action " + family + "#(" + SCHEMA_TYPE + " or " + SCHEMA_EXTENDED_TYPE + ")#"
+ action));
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New behavior changes action resolution to allow both schema and schema_extended when action is explicitly set and to prefer schema_extended when both exist. There are existing unit tests for TaCoKitGuessSchema, but none appear to cover the “both actions exist for the same name → pick extended” case. Adding a focused test would guard against regressions in this selection/ordering logic.

Copilot uses AI. Check for mistakes.
Comment on lines 245 to 246
.. search an action of type `@DiscoverSchema`.
. Execute a fake job with component to retrieve output schema.
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the “For inputs” action selection list, the last step says to “search an action of type @DiscoverSchema” (without matching the dataset name), but TaCoKitGuessSchema.guessSchemaThroughAction() doesn’t implement this fallback when action is empty (it returns false and falls back to the mock job). Update the documentation to reflect the actual behavior, or implement the documented fallback in code so the docs remain accurate.

Suggested change
.. search an action of type `@DiscoverSchema`.
. Execute a fake job with component to retrieve output schema.
. If no matching action is found, execute a fake job with the component to retrieve the output schema.

Copilot uses AI. Check for mistakes.
.orElse(null);
if (actionRef == null) {
// let's try DiscoverSchemaExtended action name
// second find DiscoverSchema action name
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is grammatically unclear: “second find DiscoverSchema action name”. Consider rephrasing (e.g., “Then try the DiscoverSchema action name”) so it’s easier to follow when debugging action selection.

Suggested change
// second find DiscoverSchema action name
// Then try the DiscoverSchema action name.

Copilot uses AI. Check for mistakes.
Comment on lines +453 to 458
.sorted((action1, action2) -> {
boolean action1IsExtended = action1.getType().equals(SCHEMA_EXTENDED_TYPE);
boolean action2IsExtended = action2.getType().equals(SCHEMA_EXTENDED_TYPE);
return Boolean.compare(!action1IsExtended, !action2IsExtended);
})
.findFirst()
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stream uses sorted(...).findFirst() just to prefer schema_extended over schema. This sorts the entire candidate list (O(n log n)) even though you only need the best match. Consider replacing it with min(comparator) (or equivalent) to select the preferred action in a single pass and keep intent clearer.

Suggested change
.sorted((action1, action2) -> {
boolean action1IsExtended = action1.getType().equals(SCHEMA_EXTENDED_TYPE);
boolean action2IsExtended = action2.getType().equals(SCHEMA_EXTENDED_TYPE);
return Boolean.compare(!action1IsExtended, !action2IsExtended);
})
.findFirst()
.min((action1, action2) -> {
boolean action1IsExtended = action1.getType().equals(SCHEMA_EXTENDED_TYPE);
boolean action2IsExtended = action2.getType().equals(SCHEMA_EXTENDED_TYPE);
return Boolean.compare(!action1IsExtended, !action2IsExtended);
})

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants