@gerrycampion
Collaborator

This pull request refactors how codelist and term metadata are structured and accessed throughout the codebase, moving from a legacy "submission_lookup" dictionary format to a more standardized "codelists" list of dictionaries. This change impacts data loading, metadata extraction, attribute querying, and related tests, leading to more consistent and maintainable handling of controlled terminology data.

Refactoring metadata structure and access:

  • Replaced the use of submission_lookup dictionaries with a list of codelist dictionaries under the codelists key throughout the codebase, including in metadata containers, service responses, and tests. [1] [2] [3] [4]
  • Updated data extraction logic in build_ct_lists and build_ct_terms (in library_metadata_container.py) to iterate over the new codelists structure, extracting codes and terms directly from dictionary entries. [1] [2]
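
As a rough illustration of the structural change described above, the sketch below contrasts a flat lookup keyed by submission value with a list of codelist dictionaries. All keys (`codelist_code`, `submission_value`, `terms`, `term_code`), the codes, and both helper functions are hypothetical and chosen for the example; the actual schema and the real `build_ct_terms` may differ.

```python
from typing import Any

# Hypothetical controlled-terminology data; every key and code here is made up.
codelists: list[dict[str, Any]] = [
    {
        "codelist_code": "CL001",
        "submission_value": "UNITA",
        "terms": [
            {"term_code": "T001", "submission_value": "OTHER"},
            {"term_code": "T002", "submission_value": "KG"},
        ],
    },
    {
        "codelist_code": "CL002",
        "submission_value": "UNITB",
        "terms": [
            {"term_code": "T101", "submission_value": "OTHER"},
        ],
    },
]


def build_legacy_lookup(codelists: list[dict[str, Any]]) -> dict[str, dict[str, str]]:
    """Flatten everything into one dict keyed by submission value (assumed legacy shape)."""
    lookup: dict[str, dict[str, str]] = {}
    for cl in codelists:
        lookup[cl["submission_value"]] = {"codelist": cl["codelist_code"], "term": "N/A"}
        for term in cl["terms"]:
            # A later term with the same submission value silently overwrites the earlier one.
            lookup[term["submission_value"]] = {
                "codelist": cl["codelist_code"],
                "term": term["term_code"],
            }
    return lookup


def collect_term_pairs(codelists: list[dict[str, Any]]) -> list[tuple[str, str]]:
    """Read (codelist_code, term_code) pairs directly from the list; nothing is overwritten."""
    return [(cl["codelist_code"], t["term_code"]) for cl in codelists for t in cl["terms"]]


legacy = build_legacy_lookup(codelists)
pairs = collect_term_pairs(codelists)
```

In this toy data the submission value `OTHER` appears in both codelists, so the flat dictionary keeps only one of the two terms, while the list-based traversal keeps both.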

Refactoring attribute extraction and lookup logic:

  • Rewrote attribute extraction in get_codelist_attributes.py to use JSONPath queries on the new codelists structure, removing multiple legacy helper methods for extracting codes and terms.
  • Updated codelist and term lookup logic in codelist_extensible.py and codelist_terms.py to search and match within the codelists list, improving robustness and reducing reliance on indirect mappings. [1] [2] [3]
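
The lookups described above could look roughly like the following. This is a dependency-free sketch that substitutes plain list comprehensions for the JSONPath queries the PR mentions; the function names, the `extensible` flag, and the other dictionary keys are assumptions made for illustration, not the project's API.

```python
from typing import Any, Optional

# Hypothetical metadata in the "codelists" list-of-dicts shape; keys and codes are made up.
metadata: dict[str, Any] = {
    "codelists": [
        {
            "codelist_code": "CL001",
            "submission_value": "SEX",
            "extensible": False,
            "terms": [
                {"term_code": "T1", "submission_value": "F"},
                {"term_code": "T2", "submission_value": "M"},
            ],
        },
        {
            "codelist_code": "CL002",
            "submission_value": "UNIT",
            "extensible": True,
            "terms": [{"term_code": "T3", "submission_value": "M"}],
        },
    ]
}


def term_attributes(
    metadata: dict[str, Any],
    codelist_values: list[str],
    attribute: str = "submission_value",
) -> list[Any]:
    """Return one attribute from every term of the codelists whose submission value matches."""
    wanted = set(codelist_values)
    return [
        term[attribute]
        for cl in metadata.get("codelists", [])
        if cl.get("submission_value") in wanted
        for term in cl.get("terms", [])
        if attribute in term
    ]


def is_extensible(metadata: dict[str, Any], codelist_value: str) -> Optional[bool]:
    """Return the extensible flag of the first matching codelist, or None if none matches."""
    for cl in metadata.get("codelists", []):
        if cl.get("submission_value") == codelist_value:
            return cl.get("extensible")
    return None


sex_terms = term_attributes(metadata, ["SEX"])
unit_codes = term_attributes(metadata, ["UNIT"], attribute="term_code")
```

Because each term is reached through its parent codelist, the term `M` under `SEX` and the term `M` under `UNIT` stay distinct.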

Test and documentation updates:

  • Refactored unit tests for codelist extensible and term operations to use the new metadata format, ensuring test coverage matches the new structure. [1] [2] [3] [4] [5]
  • Updated docstrings and example data in service and reader modules to reflect the new codelist metadata format. [1] [2] [3] [4] [5]

Collaborator

@SFJohnson24 SFJohnson24 left a comment

This PR passes high-level review. It correctly resolves the issue described in the ticket: collisions in the submission lookup caused by non-unique values. The PR allows the associated DDF rule to be executed; see the attached report, which shows the correct term from the operator:
CORE-Report-2025-12-03T15-26-33.xlsx
It also preserves extensible functionality: DDF00210 was tested (negative report attached):
CORE-Report-2025-12-03T15-50-04.xlsx
The PR replaces the submission_lookup dictionary lookup with a direct list comprehension over codelist-level submission values. It preserves the operator functionality elegantly, changing only how data is stored and retrieved.

@SFJohnson24 SFJohnson24 merged commit 9236a11 into main Dec 3, 2025
11 checks passed
@SFJohnson24 SFJohnson24 deleted the 939-rule-blocked-corerules-9609---codelist_terms-operation-does-not-retrieve-correct-terms branch December 3, 2025 20:59

3 participants