Skip to content

#391 sz_explorer: strip null/empty entries from UNMAPPED_DATA#392

Open
jbutcher21 wants to merge 3 commits into
mainfrom
issue-391-null-mappable-attributes
Open

#391 sz_explorer: strip null/empty entries from UNMAPPED_DATA#392
jbutcher21 wants to merge 3 commits into
mainfrom
issue-391-null-mappable-attributes

Conversation

@jbutcher21
Copy link
Copy Markdown
Contributor

Closes #391.

Summary

  • Mappable attributes whose source value was null/empty (e.g. DATE: null, STATUS: "", NAME_MIDDLE: "", nested ADDR_CITY: null) were returned by the engine inside UNMAPPED_DATA and rendered as noise in sz_explorer's "Additional Data" column.
  • Fix lives at the SDK-wrapper boundary: EdaSdkWrapper.call_sdk now passes every parsed engine response through a recursive cleaner before PII masking, call-cache append, and return. One choke point covers get_entity, compare_entities, the call-cache export, and any future consumer.
  • Rules: drop null and "", keep 0 / False / other non-null values, recurse into nested dicts and lists, drop nested containers that collapse to empty.

Test plan

  • sz_explorerget 17 (Darla Anderson): DATE: None and STATUS: no longer appear; AMOUNT: 0 still does; nested FEATURES shows only non-null fields.
  • sz_explorercompare <id1> <id2> exercises the second call site cleanly.
  • export call cache shows the cleaned response (consistent debugging output).
  • make lint passes (pylint, mypy, black, flake8, isort, bandit).

@jbutcher21 jbutcher21 requested a review from a team as a code owner May 18, 2026 23:25
@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

I'll review the PR diff against the checklist now.

Code Review

Code Quality

✅ No commented-out code — Clean diff, no leftover debug code.

✅ Meaningful variable names (mostly)⚠️ Minor: cv and ci in _strip_null_values (lines ~1558, ~1568) are abbreviated. cleaned_value / cleaned_item would be clearer. k is acceptable for a short dict loop.

✅ DRY principle followed — The two functions have clean separation of concerns: _strip_null_values handles recursive null-stripping, _clean_unmapped_data handles walking the response tree. No duplication.

✅ No defects found — Logic traced through the following edge cases:

  • 0 and False are correctly preserved (checks use is None, not falsy)
  • return cleaned or None correctly collapses empty containers
  • Mutation of node["UNMAPPED_DATA"] happens before the for v in node.values() loop — no concurrent modification
  • If UNMAPPED_DATA is None, it safely becomes {} via cleaned if cleaned else {}

⚠️ Missing type annotationssz_tools/sz_explorer lines ~1546, ~1576: Neither new function has type hints. The project uses gradual typing per CLAUDE.md. Suggested signatures:

def _strip_null_values(value: Any) -> Any:
def _clean_unmapped_data(node: Any) -> None:

⚠️ Multi-line docstrings — The docstrings span multiple paragraphs. Verify this is consistent with the rest of sz_explorer's function documentation style. If most functions use single-line docstrings, these should be trimmed.


Testing

❌ No tests added — Two new module-level functions (_strip_null_values, _clean_unmapped_data) have no corresponding unit tests. These functions have non-obvious edge-case behavior (collapsing empty containers, preserving 0/False, handling None at various levels) that would benefit from direct unit tests. Key cases to cover:

  • {"key": None}{}
  • {"key": ""}{}
  • {"key": 0}{"key": 0} (zero preserved)
  • {"key": False}{"key": False} (False preserved)
  • Nested collapse: {"a": {"b": None}}{}
  • Mixed: {"a": 1, "b": None, "c": ""}{"a": 1}
  • UNMAPPED_DATA nested inside a list of records

Documentation

✅ CHANGELOG.md updated — Entry is clear, correctly placed under [Unreleased], and references the issue number (#391).

✅ Docstrings present — Both functions document their non-obvious contracts (return None on collapse, mutates in place).


Security

✅ No hardcoded credentials

✅ No sensitive data in logs

✅ No license files (.lic)

✅ Input validation — The functions defensively handle all JSON-compatible Python types via isinstance checks.


Summary

Area Status
Logic correctness
No commented-out code
Type hints ⚠️ Missing
Variable names ⚠️ Minor (cv, ci)
Unit tests ❌ None added
CHANGELOG
Security

The core fix looks correct. The main gap is the absence of unit tests for the two new utility functions given their edge-case-sensitive behavior.

Automated code review analyzing defects and coding standards

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

PR Code Review: v0.0.41 – sz_explorer UNMAPPED_DATA null/empty strip


Code Quality

✅ Code style followed
Line length, naming, and formatting appear consistent with the project's Black/isort configuration.

✅ No commented-out code
No commented-out code is present.

✅ Meaningful variable names
_strip_null_values, _clean_unmapped_data, cleaned, cv, ci are all clear and appropriately scoped.

❌ DRY principle – minor issue
_strip_null_values (sz_tools/sz_explorer, ~line 1546) duplicates iteration logic between the dict and list branches. The inner pattern — skip None/"", recurse, skip if result is None, accumulate — is identical. This is minor and acceptable at this scale, but worth noting.

❌ Logic defect – _strip_null_values on non-container scalars collapses to None when value is 0, False, or empty dict/list at leaf
For a scalar that is neither dict nor list, the function returns the value unchanged — correct. However, consider the case where UNMAPPED_DATA itself is a scalar (e.g., a plain string): _strip_null_values("some_string") returns "some_string", and _clean_unmapped_data would then assign it back unchanged. That path is fine.

More importantly: if a nested dict under UNMAPPED_DATA contains {"key": {}} (an empty dict leaf), _strip_null_values({}) returns None (via cleaned or None). This means a pre-existing empty dict value gets silently dropped — intentional per the docstring ("collapses to empty"), but worth calling out explicitly since it changes existing data shape beyond just null/empty-string trimming.

✅ No obvious bugs, crashes, race conditions, or security vulnerabilities
The recursion is bounded by the engine response structure. No external input is executed. No credentials or sensitive data touched.

✅ CLAUDE.md is appropriate
The project CLAUDE.md is general-purpose and contains no environment-specific local paths or secrets.


Testing

❌ No unit tests added for _strip_null_values or _clean_unmapped_data
These are pure functions with well-defined edge cases (nested dicts, lists, 0/False preservation, empty-dict collapse). They're easy to unit test and cover meaningful behavior. No tests appear in the diff.

❌ Edge case not covered: UNMAPPED_DATA as a non-dict type
If UNMAPPED_DATA is already a string or integer (malformed response), _strip_null_values is still called on it and returns it unchanged, then cleaned if cleaned else {} coerces a non-empty string to a {} if falsy or leaves it as-is if truthy. The cleaned if cleaned else {} guard in _clean_unmapped_data only protects against None/empty-container returns — a truthy non-dict scalar would be assigned back without issue, but the behavior is implicit rather than explicit.


Documentation

✅ CHANGELOG.md updated
Entry is present and correctly formatted under [0.0.41] - 2026-05-18.

❌ CHANGELOG.md formatting – minor
The [Unreleased] section header remains with no content between it and [0.0.41]. Per the Keep a Changelog convention, a blank [Unreleased] section is fine, but the diff shows the ## [0.0.41] block was placed below the ## [Unreleased] line without adding any content to Unreleased — this is correct and standard, just confirming it's intentional.

✅ Inline docstrings present
Both _strip_null_values and _clean_unmapped_data have descriptive docstrings explaining behavior, including the non-obvious None return convention. This is appropriate given the subtle collapse semantics.

✅ README: no changes needed
This is an internal behavioral fix, not a user-facing API change.


Security

✅ No hardcoded credentials
✅ No sensitive data in logs
✅ No .lic files or AQAAAD-prefixed strings
✅ Input validation: appropriate
_clean_unmapped_data guards on isinstance before recursing — safe.


Summary

Area Status
Code style
DRY ⚠️ Minor (acceptable)
Defects ⚠️ Empty-dict collapse semantics are implicit
Unit tests ❌ Missing for new pure functions
Edge case (non-dict UNMAPPED_DATA) ⚠️ Unguarded but unlikely in practice
CHANGELOG
Docs
Security

Recommend: Add unit tests for _strip_null_values and _clean_unmapped_data covering at minimum: nested dicts with nulls, lists with nulls, preservation of 0/False, and the empty-container-collapse-to-None behavior. Everything else is clean.

Automated code review analyzing defects and coding standards

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sz_explorer: Mappable attributes with null values are considered unmapped data

2 participants