fix(QTDI-2420): Change discover schema action order for input in TaCoKitGuessSchema#1164
fix(QTDI-2420): Change discover schema action order for input in TaCoKitGuessSchema#1164yyin-talend wants to merge 7 commits intomasterfrom
Conversation
|
ypiel-talend
left a comment
There was a problem hiding this comment.
I think the TCK documentation has to be reviewed too:
https://talend.github.io/component-runtime/main/latest/studio-schema.html#_guess_schema_action_selection
ozhelezniak-talend
left a comment
There was a problem hiding this comment.
okay, but it really good to just rewrite this part completely...
There was a problem hiding this comment.
Pull request overview
Updates schema discovery selection for DI input components to prioritize @DiscoverSchemaExtended before @DiscoverSchema, aligning runtime behavior with the QTDI-2420 expectation and updating the related Studio documentation.
Changes:
- Reordered inferred action lookup to try
schema_extended(DiscoverSchemaExtended) beforeschema(DiscoverSchema) whenactionis empty. - When
actionis explicitly provided, allow bothschema_extendedandschematypes and preferschema_extended. - Updated documentation to reflect the new action selection order for inputs.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| documentation/src/main/antora/modules/ROOT/pages/studio-schema.adoc | Adjusts documented input action-selection order to prefer @DiscoverSchemaExtended before @DiscoverSchema. |
| component-studio/component-runtime-di/src/main/java/org/talend/sdk/component/runtime/di/schema/TaCoKitGuessSchema.java | Changes runtime action-selection order and selection criteria to prioritize and support both schema action types. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
...-runtime-di/src/main/java/org/talend/sdk/component/runtime/di/schema/TaCoKitGuessSchema.java
Outdated
Show resolved
Hide resolved
...-runtime-di/src/main/java/org/talend/sdk/component/runtime/di/schema/TaCoKitGuessSchema.java
Outdated
Show resolved
Hide resolved
...-runtime-di/src/main/java/org/talend/sdk/component/runtime/di/schema/TaCoKitGuessSchema.java
Show resolved
Hide resolved
|
Could you check Copilot comments, some of them looks valid |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Updates schema discovery selection to prefer @DiscoverSchemaExtended over @DiscoverSchema for input schema guessing in TaCoKitGuessSchema, and aligns (part of) the documentation with that updated selection order.
Changes:
- Reordered input action lookup to try
schema_extendedbeforeschemawhen inferring the action name from the dataset. - When an action name is explicitly provided, allow both
schemaandschema_extendedand preferschema_extendedwhen both exist. - Updated the Studio schema documentation’s “For inputs” selection order accordingly.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| documentation/src/main/antora/modules/ROOT/pages/studio-schema.adoc | Updates documented action selection order for input schema discovery. |
| component-studio/component-runtime-di/src/main/java/org/talend/sdk/component/runtime/di/schema/TaCoKitGuessSchema.java | Adjusts runtime action selection to prefer schema_extended and support both action types for the same name. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| actionRef = services | ||
| .stream() | ||
| .flatMap(s -> s.getActions().stream()) | ||
| .filter(a -> a.getFamily().equals(family) && a.getAction().equals(action) | ||
| && a.getType().equals(SCHEMA_TYPE)) | ||
| && (a.getType().equals(SCHEMA_EXTENDED_TYPE) || a.getType().equals(SCHEMA_TYPE))) | ||
| // When both DiscoverSchemaExtended and DiscoverSchema exist for the same action name, | ||
| // prefer the extended schema action by ordering it first. | ||
| .sorted((action1, action2) -> { | ||
| boolean action1IsExtended = action1.getType().equals(SCHEMA_EXTENDED_TYPE); | ||
| boolean action2IsExtended = action2.getType().equals(SCHEMA_EXTENDED_TYPE); | ||
| return Boolean.compare(!action1IsExtended, !action2IsExtended); | ||
| }) | ||
| .findFirst() | ||
| .orElseThrow(() -> new IllegalArgumentException( | ||
| "No action " + family + "#" + SCHEMA_TYPE + "#" + action)); | ||
| "No action " + family + "#(" + SCHEMA_TYPE + " or " + SCHEMA_EXTENDED_TYPE + ")#" | ||
| + action)); |
There was a problem hiding this comment.
New behavior changes action resolution to allow both schema and schema_extended when action is explicitly set and to prefer schema_extended when both exist. There are existing unit tests for TaCoKitGuessSchema, but none appear to cover the “both actions exist for the same name → pick extended” case. Adding a focused test would guard against regressions in this selection/ordering logic.
| .. search an action of type `@DiscoverSchema`. | ||
| . Execute a fake job with component to retrieve output schema. |
There was a problem hiding this comment.
In the “For inputs” action selection list, the last step says to “search an action of type @DiscoverSchema” (without matching the dataset name), but TaCoKitGuessSchema.guessSchemaThroughAction() doesn’t implement this fallback when action is empty (it returns false and falls back to the mock job). Update the documentation to reflect the actual behavior, or implement the documented fallback in code so the docs remain accurate.
| .. search an action of type `@DiscoverSchema`. | |
| . Execute a fake job with component to retrieve output schema. | |
| . If no matching action is found, execute a fake job with the component to retrieve the output schema. |
| .orElse(null); | ||
| if (actionRef == null) { | ||
| // let's try DiscoverSchemaExtended action name | ||
| // second find DiscoverSchema action name |
There was a problem hiding this comment.
This comment is grammatically unclear: “second find DiscoverSchema action name”. Consider rephrasing (e.g., “Then try the DiscoverSchema action name”) so it’s easier to follow when debugging action selection.
| // second find DiscoverSchema action name | |
| // Then try the DiscoverSchema action name. |
| .sorted((action1, action2) -> { | ||
| boolean action1IsExtended = action1.getType().equals(SCHEMA_EXTENDED_TYPE); | ||
| boolean action2IsExtended = action2.getType().equals(SCHEMA_EXTENDED_TYPE); | ||
| return Boolean.compare(!action1IsExtended, !action2IsExtended); | ||
| }) | ||
| .findFirst() |
There was a problem hiding this comment.
The stream uses sorted(...).findFirst() just to prefer schema_extended over schema. This sorts the entire candidate list (O(n log n)) even though you only need the best match. Consider replacing it with min(comparator) (or equivalent) to select the preferred action in a single pass and keep intent clearer.
| .sorted((action1, action2) -> { | |
| boolean action1IsExtended = action1.getType().equals(SCHEMA_EXTENDED_TYPE); | |
| boolean action2IsExtended = action2.getType().equals(SCHEMA_EXTENDED_TYPE); | |
| return Boolean.compare(!action1IsExtended, !action2IsExtended); | |
| }) | |
| .findFirst() | |
| .min((action1, action2) -> { | |
| boolean action1IsExtended = action1.getType().equals(SCHEMA_EXTENDED_TYPE); | |
| boolean action2IsExtended = action2.getType().equals(SCHEMA_EXTENDED_TYPE); | |
| return Boolean.compare(!action1IsExtended, !action2IsExtended); | |
| }) |

0 New Issues
0 Fixed Issues
0 Accepted Issues
https://qlik-dev.atlassian.net/browse/QTDI-2420