1017 ddf00102 #1469

RakeshBobba03 · 2025-12-04T23:55:22Z

Fixes Issue #1017 (CORERULES-9645): Updated is_column_of_iterables to check all non-null values instead of only the first row. When the first row is None (common after grouped distinct operations with LEFT joins), the method now correctly identifies columns containing iterables by examining all non-null values. Also fixed boolean column reading in Excel files by using nullable boolean dtype ("boolean") and adding true_values/false_values parameters to handle NaN values. This fix ensures containment operators work correctly with columns containing None values and affects USDM rules using grouped distinct operations with containment operators.
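To make the first-row problem concrete, here is a minimal sketch of the check described above. It is not the engine's actual implementation: the function name `is_column_of_iterables_sketch` is hypothetical, and the rule that only `list` and `set` values count as iterables is an assumption inferred from the unit tests in this PR.

```python
import pandas as pd


def is_column_of_iterables_sketch(column: pd.Series) -> bool:
    """Hypothetical sketch; not the engine's actual implementation."""
    # Drop None, NaN and pd.NA in one vectorised step.
    non_null = column[column.notna()]
    if non_null.empty:
        # Empty or all-null columns are not treated as iterable columns.
        return False
    # Assumption taken from the test table: only list and set count.
    return all(isinstance(value, (list, set)) for value in non_null)


# First row is None, as can happen after a grouped distinct with a LEFT join.
grouped = pd.Series([None, ["A", "B"], ["C", "D"]], dtype=object)

# Looking only at the first row misclassifies the column.
first_row_only = isinstance(grouped.iloc[0], (list, set))

# Looking at every non-null value classifies it correctly.
all_rows = is_column_of_iterables_sketch(grouped)
```

Here `first_row_only` is `False` while `all_rows` is `True`, which is the difference in behavior this PR is addressing.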

gerrycampion

Two updates. I will leave it to Richard to confirm things are working for him.

gerrycampion · 2025-12-05T15:14:09Z

cdisc_rules_engine/check_operators/dataframe_operators.py

+        non_null_values = []
+        for val in column:
+            if val is None:
+                continue
+            try:
+                if pd.isna(val):
+                    continue
+            except (ValueError, TypeError):
+                pass
+            non_null_values.append(val)


I believe you can replace this block with:
non_null_values = column[column.notnull()]

It would be useful to take a tutorial to learn some pandas fundamentals.
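As a rough illustration of the suggestion, the sketch below compares the manual loop from the diff with a boolean mask on a small object column. The helper names `non_null_loop` and `non_null_mask` and the sample data are made up for this example. `Series.notna()` is used here; `Series.notnull()` is an alias for it.

```python
import pandas as pd


def non_null_loop(column: pd.Series) -> list:
    """Manual filtering, mirroring the loop in the diff above."""
    kept = []
    for val in column:
        if val is None:
            continue
        try:
            if pd.isna(val):
                continue
        except (ValueError, TypeError):
            # pd.isna on a multi-element list returns an array, whose
            # truth value is ambiguous; treat such values as non-null.
            pass
        kept.append(val)
    return kept


def non_null_mask(column: pd.Series) -> pd.Series:
    """Vectorised filtering: notna() is evaluated once per element."""
    return column[column.notna()]


column = pd.Series(
    [None, float("nan"), ["A", "B"], {"C"}, pd.NA], dtype=object
)

loop_result = non_null_loop(column)
mask_result = non_null_mask(column).tolist()
```

On this sample both approaches keep only the list and the set. The mask version also keeps the original index labels, which can be handy when the surviving rows need to be traced back to the source data.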

gerrycampion · 2025-12-05T15:19:58Z

tests/unit/test_check_operators/test_containment_checks.py

+@pytest.mark.parametrize(
+    "column_data,expected",
+    [
+        ([["A", "B"], ["C", "D"], ["E", "F"]], True),
+        ([{"A", "B"}, {"C", "D"}], True),
+        ([None, ["A", "B"], ["C", "D"]], True),
+        ([["A", "B"], None, ["C", "D"]], True),
+        ([["A", "B"], ["C", "D"], None], True),
+        ([None, None, ["A", "B"]], True),
+        ([None, {"A", "B"}, {"C", "D"}], True),
+        ([[]], True),
+        ([set()], True),
+        ([None, []], True),
+        ([None, set()], True),
+        ([["A"], []], True),
+        ([["A"], set()], True),
+        ([{"A"}, []], True),
+        ([["A"]], True),
+        ([{"A"}], True),
+        ([None, ["A"]], True),
+        ([["A"], {"B"}], True),
+        ([None, ["A"], {"B"}], True),
+        ([float("nan")], False),
+        ([None, float("nan")], False),
+        ([None, float("nan"), ["A"]], True),
+        ([float("nan"), ["A"]], True),
+        # Negative cases - not iterables
+        (["A", "B", "C"], False),
+        ([None, "A", "B"], False),
+        ([["A", "B"], "C", ["D", "E"]], False),
+        ([None, None, None], False),
+        ([], False),
+        ([("A", "B")], False),
+        ([None, ("A", "B")], False),
+        ([["A"], ("B", "C")], False),
+    ],
+)
+def test_is_column_of_iterables(column_data, expected):
+    df = PandasDataset.from_dict({"col": column_data})
+    dataframe_operator = DataframeType({"value": df})
+    result = dataframe_operator.is_column_of_iterables(df["col"])
+    assert result == expected
+
+
+def test_is_column_of_iterables_with_pd_na():
+    """Test pd.NA handling (pandas NA value)"""
+    import pandas as pd
+
+    test_cases = [
+        ([pd.NA], False),
+        ([None, pd.NA], False),
+        ([None, pd.NA, ["A"]], True),
+        ([pd.NA, ["A"]], True),
+        ([None, pd.NA, ["A"], ["B"]], True),
+    ]


Sorry, one more thing. I think these test cases should be combined into a single function.
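One way to combine them is to fold the `pd.NA` cases into the existing parametrized table, since `pd.NA` can be referenced directly inside the parameter list once `pandas` is imported at module level. The sketch below shows the shape only. It uses a hypothetical stand-in function rather than `DataframeType.is_column_of_iterables`, and only a handful of the cases from the diff.

```python
import pandas as pd
import pytest


def is_column_of_iterables_sketch(column: pd.Series) -> bool:
    # Hypothetical stand-in for DataframeType.is_column_of_iterables.
    non_null = column[column.notna()]
    return (not non_null.empty) and all(
        isinstance(value, (list, set)) for value in non_null
    )


@pytest.mark.parametrize(
    "column_data,expected",
    [
        ([None, ["A", "B"]], True),
        ([float("nan"), ["A"]], True),
        (["A", "B", "C"], False),
        # pd.NA cases live in the same table instead of a second test.
        ([pd.NA], False),
        ([None, pd.NA], False),
        ([None, pd.NA, ["A"]], True),
        ([pd.NA, ["A"]], True),
    ],
)
def test_is_column_of_iterables_combined(column_data, expected):
    column = pd.Series(column_data, dtype=object)
    assert is_column_of_iterables_sketch(column) == expected
```

Keeping every case in one table means a new null-like value only needs one extra row, and each row is reported as its own test case by pytest.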

ASL-rmarshall

General code review looks good. Validation produced expected results.

RakeshBobba03 and others added 8 commits November 20, 2025 18:19

Fix is_column_of_iterables to check all values instead of just first row

01db764

Unit test update

755ea5e

Merge branch 'main' into 1017-DDF00102

660bc3c

Documentation Update

524ec6a

Merge branch 'main' into 1017-DDF00102

02c9c28

Merge branch 'main' into 1017-DDF00102

d5b4e61

Fix is_column_of_iterables to check all non-null values instead of only first row

81bbe1e

Merge branch 'main' into 1017-DDF00102

164ec8c

RakeshBobba03 temporarily deployed to DEV December 4, 2025 23:55 — with GitHub Actions Inactive

RakeshBobba03 linked an issue Dec 5, 2025 that may be closed by this pull request

CORERULES-9645 - is_column_of_iterables method only checks first row #1017

Closed

RakeshBobba03 requested review from RamilCDISC, SFJohnson24 and gerrycampion December 5, 2025 00:07

RakeshBobba03 marked this pull request as ready for review December 5, 2025 00:07

Fix is_column_of_iterables to correctly handle None values from grouped distinct operations

bfaf0e0

RakeshBobba03 temporarily deployed to DEV December 5, 2025 01:56 — with GitHub Actions Inactive

RakeshBobba03 requested a review from ASL-rmarshall December 5, 2025 14:26

gerrycampion requested changes Dec 5, 2025

ASL-rmarshall approved these changes Dec 5, 2025

Switch to column[column.notna()] instead of manual loop and combined test cases to a single function

021a81f

RakeshBobba03 temporarily deployed to DEV December 5, 2025 16:10 — with GitHub Actions Inactive

RakeshBobba03 requested a review from gerrycampion December 5, 2025 16:10

gerrycampion approved these changes Dec 5, 2025

gerrycampion merged commit 29bc10b into main Dec 5, 2025
11 checks passed

gerrycampion deleted the 1017-DDF00102 branch December 5, 2025 16:25
