cg0370 #1404
RamilCDISC
left a comment
Could you please also update the Operations.md file with an example that uses value_is_reference?
Updated docs @RamilCDISC
RamilCDISC
left a comment
The PR adds support for using reference values in the Distinct operation. The PR was validated by:
- Reviewing the updated code for any unwanted code or comments.
- Reviewing the updated logic in accordance with the Issue description.
- Ensuring all relevant tests are updated and new tests are added.
- Ensuring all related documentation and schema files are updated.
- Ensuring new tests are added if necessary.
- Validating the implementation using the dev editor, running CG0370 against negative datasets.
- Validating the implementation using the dev editor, running CG0370 against positive datasets.
- Testing the edge cases of a missing column and null values.
This pull request introduces support for using reference values in the `Distinct` operation, allowing the operation to treat the value in the target column as a reference to another column in the same row. This is controlled via a new `value_is_reference` boolean parameter, which is now supported throughout the operation pipeline. The changes also include updates to the schema, new tests for this functionality, and a minor improvement to domain matching logic.
Datasets.json
Rule_underscores.json
This is one of the CG0370 sub-rules that uses this logic, together with negative data for it.
Key changes:
Distinct Operation Reference Value Support:
- Added a `value_is_reference` boolean parameter to `OperationParams`, and updated the `Distinct` operation logic to use the value in the target column as a reference to another column when this flag is set. This includes support for both grouped and ungrouped distinct operations.
- Updated the schema (`Operations.json`) to include the new `value_is_reference` parameter.
- Handled the `value_is_reference` parameter when constructing operation parameters.
Testing:
- Added tests for the `Distinct` operation, covering both Pandas and Dask datasets.
Domain Matching Logic:
- [...] `--`) to all domains with a common prefix.
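For readers who want a concrete picture of the `value_is_reference` behavior described in this PR, here is a minimal pandas sketch. It is not the code from this PR: the helper `distinct_values`, its signature, and the sample columns (`USUBJID`, `IDVAR`, `AESEQ`, `AESPID`) are assumptions made up for illustration. It shows the idea of resolving each target cell as the name of another column in the same row, for both the ungrouped and grouped cases, and it skips null references and references to a missing column.

```python
import pandas as pd


def distinct_values(df, target, value_is_reference=False, group=None):
    """Hypothetical helper; NOT the engine's actual Distinct implementation."""
    if value_is_reference:
        # Treat each cell of `target` as the name of another column and
        # read that column's value from the same row.
        def resolve(row):
            ref = row[target]
            if pd.isna(ref) or ref not in row.index:
                return None  # null reference or missing column: nothing to collect
            return row[ref]

        values = df.apply(resolve, axis=1)
    else:
        values = df[target]

    if group is None:
        # Ungrouped: one set of distinct non-null values.
        return {v for v in values if pd.notna(v)}

    # Grouped: one set of distinct non-null values per group key (a tuple).
    result = {}
    keys = df[group].itertuples(index=False, name=None)
    for key, value in zip(keys, values):
        if pd.notna(value):
            result.setdefault(key, set()).add(value)
    return result


# Made-up sample data: IDVAR names the column whose value should be used.
df = pd.DataFrame(
    {
        "USUBJID": ["S1", "S1", "S2", "S2", "S2"],
        "IDVAR": ["AESEQ", "AESPID", "AESEQ", None, "NOPE"],
        "AESEQ": [1, 2, 1, 3, 4],
        "AESPID": ["A", "B", "C", "D", "E"],
    }
)

plain = distinct_values(df, "IDVAR")
referenced = distinct_values(df, "IDVAR", value_is_reference=True)
grouped = distinct_values(df, "IDVAR", value_is_reference=True, group=["USUBJID"])
```

Without the flag, the result is the set of column names stored in `IDVAR`; with the flag, it is the set of values those names point to in each row. How the real operation treats null or unknown references may differ from this sketch.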