@SFJohnson24 SFJohnson24 commented Nov 7, 2025

test data:
pos.json
Rule_underscores.json
neg.json

This PR adds wildcard logic for all operations and allows the regex argument = for the above rule.

This pull request introduces improvements to how domain wildcards (like --SEQ) are replaced with concrete domain-specific variable names (like AESEQ) across the codebase, especially in rule processing and operation parameter handling. It also adds support for the just_date parameter in uniqueness checks, refines test coverage, and updates schema and documentation for new functionality.

Domain wildcard replacement and parameter preprocessing

  • Added a new _preprocess_operation_params method to RuleProcessor that performs shallow copying and recursive wildcard replacement for operation parameters, ensuring domain-specific variable names are used without mutating the original input objects. This method also handles supplemental domains by switching to rdomain when appropriate. (cdisc_rules_engine/utilities/rule_processor.py)
  • Updated the rule processor to call _preprocess_operation_params before executing operations, ensuring all parameters are correctly formatted for the current domain. (cdisc_rules_engine/utilities/rule_processor.py)
  • Refactored tests to cover the new preprocessing logic, including cases for wildcard replacement and supplemental domain handling. (tests/unit/test_utilities/test_rule_processor.py)
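
The replacement described above can be illustrated with a minimal sketch. This is not the code in `rule_processor.py`: the function names `replace_wildcards` and `preprocess_params` are hypothetical, and two behaviors are assumptions made for illustration: only a leading `--` is replaced, and a dataset counts as supplemental when its domain starts with `SUPP`.

```python
from typing import Any, Optional


def replace_wildcards(value: Any, domain: str) -> Any:
    """Return a copy of value with a leading '--' in each string replaced by domain.

    New lists and dicts are built at every level, so the input is never mutated.
    """
    if isinstance(value, str):
        # Assumption: the wildcard only appears as a prefix, e.g. "--SEQ".
        return domain + value[2:] if value.startswith("--") else value
    if isinstance(value, list):
        return [replace_wildcards(item, domain) for item in value]
    if isinstance(value, dict):
        return {key: replace_wildcards(item, domain) for key, item in value.items()}
    return value


def preprocess_params(
    params: dict, domain: str, rdomain: Optional[str] = None
) -> dict:
    """Hypothetical stand-in for the preprocessing step described in this PR."""
    # Assumption: supplemental datasets are detected by a "SUPP" prefix and
    # resolve wildcards against the related domain (rdomain) instead.
    use_rdomain = rdomain is not None and domain.upper().startswith("SUPP")
    return replace_wildcards(params, rdomain if use_rdomain else domain)


params = {"name": "--SEQ", "group": ["USUBJID", "--TESTCD"]}
resolved = preprocess_params(params, "AE")
print(resolved)  # {'name': 'AESEQ', 'group': ['USUBJID', 'AETESTCD']}
print(params)    # original is unchanged
```

Building new containers instead of editing in place is what allows the same rule definition to be reused for each domain it is evaluated against.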

Uniqueness check enhancements

  • Enhanced the is_unique_set operator to support a new just_date parameter, which allows uniqueness checks to be performed on just the date portion of datetime values, ignoring time. (cdisc_rules_engine/check_operators/dataframe_operators.py)
  • Updated the operator schema and documentation to include the new just_date parameter, clarifying its usage for users. (resources/schema/Operator.json, resources/schema/Operator.md) [1] [2]
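
To illustrate what comparing on the date portion means, here is a plain-Python sketch. It is not the operator in `dataframe_operators.py`, which works on dataframes; the helper names `date_part` and `is_unique_set_flags` are hypothetical, and the sketch assumes datetime values are ISO 8601 strings such as `2025-11-07T08:00`.

```python
import re
from collections import Counter

# Assumption: datetimes are ISO 8601 strings with a "T" separating date and time.
_ISO_DATETIME = re.compile(r"^(\d{4}-\d{2}-\d{2})T")


def date_part(value):
    """Return only the date portion of an ISO 8601 datetime string.

    Values that do not look like a datetime are returned unchanged.
    """
    if isinstance(value, str):
        match = _ISO_DATETIME.match(value)
        if match:
            return match.group(1)
    return value


def is_unique_set_flags(rows, columns, just_date=False):
    """Return one boolean per row: True if the row's key occurs exactly once."""
    def key(row):
        values = (row.get(col) for col in columns)
        return tuple(date_part(v) if just_date else v for v in values)

    keys = [key(row) for row in rows]
    counts = Counter(keys)
    return [counts[k] == 1 for k in keys]


rows = [
    {"USUBJID": "01", "AESTDTC": "2025-11-07T08:00"},
    {"USUBJID": "01", "AESTDTC": "2025-11-07T14:30"},
    {"USUBJID": "02", "AESTDTC": "2025-11-07T08:00"},
]
print(is_unique_set_flags(rows, ["USUBJID", "AESTDTC"]))                  # [True, True, True]
print(is_unique_set_flags(rows, ["USUBJID", "AESTDTC"], just_date=True))  # [False, False, True]
```

With `just_date` enabled, the first two records collide because they share a subject and a calendar date, even though their times differ.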

Minor fixes and test improvements

  • Fixed variable naming in the variable_is_null operation to avoid unnecessary domain prefix replacement, simplifying the logic. (cdisc_rules_engine/operations/variable_is_null.py, tests/unit/test_operations/test_variable_is_null.py) [1] [2]
  • Updated grouping logic in record count operations to use the correct dataset reference. (cdisc_rules_engine/operations/record_count.py)
  • Refactored test setup to use consistent dataset metadata objects and improved test coverage for rule operations. (tests/unit/test_utilities/test_rule_processor.py) [1] [2] [3] [4]

These changes collectively improve the accuracy, maintainability, and usability of rule processing and dataset operations in the codebase.

@SFJohnson24 SFJohnson24 marked this pull request as ready for review November 10, 2025 19:33
@RamilCDISC RamilCDISC left a comment

The PR updates the way '--' wildcards are handled and adds regex logic. The PR was validated by:

  1. Reviewing the PR for any unwanted code or comments.
  2. Reviewing the updated logic in accordance with AC.
  3. Ensuring all unit and regression tests pass.
  4. Ensuring all relevant testing is updated.
  5. Validating the AC using positive datasets in dev editor.
  6. Validating the AC using negative datasets in dev editor.
  7. Validating the edge cases related to regex.

@RamilCDISC RamilCDISC merged commit 151910f into main Nov 13, 2025
12 checks passed
@RamilCDISC RamilCDISC deleted the CG0562 branch November 13, 2025 21:29
4 participants