@RakeshBobba03
Collaborator

Fixed MultiIndex handling to properly detect when all index levels need to be removed. Instead of calling droplevel() when the number of levels to drop equals the number of levels in the MultiIndex (pandas requires at least one level to remain), the code now extracts the innermost level values and reconstructs a Series with the correct row index mapping.
Added checks to handle cases where groupby().apply() returns a DataFrame instead of a Series, extracting the first column to ensure consistent Series handling.
Updated check_basic_sort_order to enforce strict ordering by using >= for ascending checks and <= for descending checks. This correctly flags identical values (e.g., "DIAG", "DIAG") as violations, ensuring the operator detects missing sequence numbers as required by rules like CG0546.
Fixed index alignment issues that were masking errors. Both basic_sort_check and date_overlap_check are now reindexed to sorted_df.index before combining, ensuring all boolean Series are properly aligned and preventing fill_value=True from masking actual violations due to index mismatches.
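
To illustrate the last two points, here is a minimal sketch, not the engine's actual code. The helper `strict_order_ok`, the column name `MIDS`, and the sample index labels are hypothetical; only the strict comparison and the reindex to `sorted_df.index` come from the description above. It shows that identical adjacent values fail a strict ascending check, and that reindexing a misaligned Series with `fill_value=True` hides that failure.

```python
import pandas as pd


def strict_order_ok(values: pd.Series, ascending: bool = True) -> pd.Series:
    """Return True per row when the row strictly follows its predecessor.

    Hypothetical helper: ties count as violations, mirroring the
    >= (ascending) and <= (descending) comparisons described above.
    """
    vals = list(values)
    ok = [True] * len(vals)  # the first row has no predecessor to violate
    for i in range(1, len(vals)):
        prev, cur = vals[i - 1], vals[i]
        violated = prev >= cur if ascending else prev <= cur
        ok[i] = not violated
    return pd.Series(ok, index=values.index, dtype=bool)


# Assumed sample data: non-default row labels, with a repeated value.
sorted_df = pd.DataFrame({"MIDS": ["DIAG", "DIAG", "RELAPSE"]}, index=[10, 11, 12])
sort_ok = strict_order_ok(sorted_df["MIDS"])

# The same flags carried on a positional index (0, 1, 2) no longer share
# any label with sorted_df, so reindexing fills every row with True.
misaligned = pd.Series(sort_ok.to_numpy(), index=[0, 1, 2])
masked = misaligned.reindex(sorted_df.index, fill_value=True)

# With matching labels the reindex is a no-op and the violation survives.
aligned = sort_ok.reindex(sorted_df.index, fill_value=True)
overlap_ok = pd.Series(True, index=sorted_df.index)  # stand-in for the date overlap check
combined = aligned & overlap_ok

print(sort_ok.tolist(), masked.tolist(), combined.tolist())
```

In this sketch the repeated "DIAG" at label 11 is reported only when the Series share `sorted_df`'s index; the misaligned variant reports every row as passing.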

… fix MultiIndex handling, and correct violation detection logic
Collaborator

@RamilCDISC left a comment

I ran a validation using the dataset and rule attached to the issue by mhungria. In the dataset, I changed SM.MIDSTYPE from DIAGNOSIS to a value other than DIAGNOSIS. Running the validation in the dev editor then produces the following error:

{
  "SM": [
    {
      "executionStatus": "skipped",
      "dataset": "sm.xpt",
      "domain": "SM",
      "variables": [],
      "message": "rule evaluation error - operation failed",
      "errors": [
        {
          "dataset": "sm.xpt",
          "error": "Error occurred during operation execution",
          "message": "Failed to execute rule operation. Operation: record_count, Target: None, Domain: SM, Error: single positional indexer is out-of-bounds"
        }
      ]
    }
  ],
  "TM": [
    {
      "executionStatus": "skipped",
      "dataset": "tm.xpt",
      "domain": "TM",
      "variables": [],
      "message": "Rule skipped - doesn't apply to domain for rule id=CDISC.SDTMIG.CG0546, dataset=TM",
      "errors": []
    }
  ]
}

I believe this change should not make the engine throw an exception. The rule compares the SM.MIDSTYPE value with TM.MIDSTYPE. If I am misunderstanding something here, please let me know.
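
For context, "single positional indexer is out-of-bounds" is the message pandas attaches to the `IndexError` raised when `.iloc` is given a position that does not exist, for example position 0 of an empty result. The sketch below reproduces that message under an assumed scenario (a filter that matches no rows). Where the engine's `record_count` operation actually hits this is not known from the report, and `first_or_none` is a hypothetical guard, not an engine function.

```python
import pandas as pd

# Assumed sample data: no SM row has MIDSTYPE equal to "DIAGNOSIS".
sm = pd.DataFrame({"USUBJID": ["S1", "S2"], "MIDSTYPE": ["OTHER", "OTHER"]})
matches = sm[sm["MIDSTYPE"] == "DIAGNOSIS"]  # empty DataFrame

# Taking the first row of an empty frame by position raises IndexError.
try:
    matches.iloc[0]
    message = None
except IndexError as exc:
    message = str(exc)


def first_or_none(df: pd.DataFrame):
    """Hypothetical guard: return the first row, or None when there is none."""
    return None if df.empty else df.iloc[0]


# Counting rows does not need positional access and is safe on empty input.
record_count = len(matches)

print(message, first_or_none(matches), record_count)
```

A guard of this kind would let an empty match count as zero records rather than abort the rule, which is the behavior the comment above asks for.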

Development

Successfully merging this pull request may close these issues.

Request to change 'within' to list instead of single string in target_is_sorted_by operator (Rule CG0546)

5 participants