Skip to content

Conversation

@landonshumway-ia
Copy link
Collaborator

@landonshumway-ia landonshumway-ia commented Oct 30, 2025

We had an issue in beta where a state uploaded many license records, all under the same SSN. This created thousands of license update records under the same licensee, which broken several parts of the system as it was not designed to handle this case of the same licensee being updated thousands of time with a single license upload.

To prevent this from occurring in the future, this adds additional logic in our CSV uploads by checking for duplicates SSN in the file as we process them and not processing any records that duplicate SSN that were already processed. In addition, the JSON API has been updated to check all the records in the request for any duplicates, and rejecting the entire request.

We will also update our technical documentation to notify states that we do not accept repeated SSNs within the same CSV file.

Closes #1181

Summary by CodeRabbit

  • Documentation

    • Clarified upload guidance: SSNs must be unique within a single CSV file upload (duplicates in-file are rejected after the first occurrence) and within a single JSON request payload.
  • New Features

    • Duplicate-SSN detection: uploads containing the same SSN more than once in the same file or request are rejected with a validation error/400 response. The check can be toggled via a feature flag.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 30, 2025

Walkthrough

Adds a FeatureFlagEnum and migrates string flag checks to it; introduces a feature-flag–gated duplicate-SSN uniqueness check for CSV and JSON license uploads; updates docs, tests, feature-flag stack, and AppStack environment variable composition. Minor enum and default-update-category renames included.

Changes

Cohort / File(s) Summary
Documentation
backend/compact-connect/docs/README.md, backend/compact-connect/docs/it_staff_onboarding_instructions.md
Documented SSN uniqueness rules: CSV uploads must not contain duplicate SSNs within a single file; JSON batch requests must not contain duplicate SSNs within the same payload.
Feature flag enum
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py
Added FeatureFlagEnum (StrEnum) with TEST_FLAG, ENCUMBRANCE_MULTI_CATEGORY_FLAG, and DUPLICATE_SSN_UPLOAD_CHECK_FLAG.
Feature flag client
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py
is_feature_enabled signature updated to accept FeatureFlagEnum; added logging around checks.
Flag usage updates
backend/compact-connect/lambdas/python/common/cc_common/data_model/..., backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py, backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py
Replaced string-literal flag checks with FeatureFlagEnum members.
UpdateCategory rename
backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/common.py
Removed UpdateCategory.OTHER; added UpdateCategory.LICENSE_UPLOAD_UPDATE_OTHER = 'other'.
CSV duplicate-SSN check (bulk_upload)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py
Added module-level cached flag and per-file SSN tracker; when enabled, raises ValidationError on duplicate SSNs within the same CSV upload.
JSON duplicate-SSN check (licenses)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py
Added module-level cached duplicate_ssn_check_flag_enabled and runtime validation to reject request payloads containing duplicate SSNs (raises 400 / CCInvalidRequestCustomResponseException).
Ingest handler default change
backend/compact-connect/lambdas/python/provider-data-v1/handlers/ingest.py
Default update_type changed from UpdateCategory.OTHER to UpdateCategory.LICENSE_UPLOAD_UPDATE_OTHER.
Tests (feature flag client)
backend/compact-connect/lambdas/python/common/tests/unit/test_feature_flag_client.py
Tests updated to use FeatureFlagEnum.TEST_FLAG instead of literal strings.
Tests (bulk upload & licenses)
backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_bulk_upload.py, .../test_licenses.py
Added MagicMock-based patching of is_feature_enabled and tests asserting duplicates are rejected; duplicate test method definitions appear in both files in this diff.
Feature flag stack
backend/compact-connect/stacks/feature_flag_stack/__init__.py
Added duplicate_ssn_upload_check_flag feature flag resource (auto-enabled for TEST, BETA, PROD).
AppStack env vars
backend/common-cdk/common_constructs/stack.py
Conditionally add API_BASE_URL to common_env_vars when API domain name is present.
CI workflow
.github/workflows/check-compact-connect.yml
Pin pip to <25.3 before running dependency sync in workflow steps.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor Client
    participant Handler
    participant FeatureFlagClient
    participant Validator
    participant Storage

    Client->>Handler: Submit license batch (CSV or JSON)
    Handler->>FeatureFlagClient: is_feature_enabled(DUPLICATE_SSN_UPLOAD_CHECK_FLAG)
    alt Flag enabled
        Handler->>Validator: Extract SSNs from file/request
        Validator->>Validator: Check for duplicates within payload
        alt Duplicate found
            Validator-->>Handler: Validation error (indices/details)
            Handler-->>Client: 400 validation error
        else No duplicates
            Handler->>Storage: Process batch
            Handler-->>Client: 200/201 success
        end
    else Flag disabled
        Handler->>Storage: Process batch (no duplicate check)
        Handler-->>Client: 200/201 success
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review focus:
    • Lifecycle and concurrency assumptions for module-level cached flags and in-memory ssns_in_file_upload in bulk_upload.py.
    • Duplicate test method insertions in test_bulk_upload.py and test_licenses.py.
    • All call sites and mocks for is_feature_enabled updated to accept FeatureFlagEnum.
    • Ensure no remaining references to UpdateCategory.OTHER.

Possibly related PRs

Suggested reviewers

  • jlkravitz
  • jusdino
  • isabeleliassen

Poem

🐰 I hopped through rows with whiskers bright,

One SSN each, kept tidy and tight.
Flags now guard the upload gate,
Duplicates stopped — batches stay straight.
A rabbit cheers: neat uploads tonight!

Pre-merge checks and finishing touches

❌ Failed checks (2 inconclusive)
Check name Status Explanation Resolution
Description check ❓ Inconclusive The PR description clearly explains the problem, solution for CSV and JSON uploads, and mentions documentation updates. However, it does not follow the provided template structure with Requirements/Description/Testing sections. Consider restructuring the description to match the repository template with explicit Requirements, Description, and Testing sections for consistency.
Out of Scope Changes check ❓ Inconclusive The PR includes an out-of-scope feature flag infrastructure refactoring (migrating from string-based to enum-based feature flags) that, while related to the SSN validation feature flag, extends beyond the core duplicate SSN prevention requirements. Clarify whether the feature flag refactoring (FeatureFlagEnum introduction and migration across multiple files) is a prerequisite for the duplicate SSN check or should be separated into a distinct PR.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Feat/prevent repeated ssns' directly relates to the main objective of preventing duplicate SSN uploads in both CSV and JSON APIs as described in the PR.
Linked Issues check ✅ Passed The PR implements all coding requirements from issue #1181: CSV duplicate SSN detection with per-file tracking, JSON API batch validation rejecting entire requests, and documentation updates are all present in the changes.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95b71cb and 7698b35.

📒 Files selected for processing (2)
  • backend/compact-connect/docs/README.md (1 hunks)
  • backend/compact-connect/docs/it_staff_onboarding_instructions.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/compact-connect/docs/README.md
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-15T19:56:58.050Z
Learnt from: landonshumway-ia
Repo: csg-org/CompactConnect PR: 1135
File: backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/privilege/record.py:60-0
Timestamp: 2025-10-15T19:56:58.050Z
Learning: In the CompactConnect codebase, when migrating from a single field to a list field (e.g., clinicalPrivilegeActionCategory to clinicalPrivilegeActionCategories), both fields are intentionally kept as optional during the migration period to support backwards compatibility. Mutual exclusivity validation is not enforced during this phase, as the deprecated field will be removed in a follow-up PR with migration scripts.

Applied to files:

  • backend/compact-connect/docs/it_staff_onboarding_instructions.md
📚 Learning: 2025-09-04T17:56:36.542Z
Learnt from: landonshumway-ia
Repo: csg-org/CompactConnect PR: 1058
File: backend/compact-connect/docs/it_staff_onboarding_instructions.md:203-216
Timestamp: 2025-09-04T17:56:36.542Z
Learning: In the CompactConnect CSV upload process, the form field name "content-type" (lowercase) works successfully in practice, despite theoretical concerns about S3 POST policy case sensitivity. The user landonshumway-ia confirmed successful local testing with the current documentation format.

Applied to files:

  • backend/compact-connect/docs/it_staff_onboarding_instructions.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: TestApp
  • GitHub Check: TestApp
🔇 Additional comments (1)
backend/compact-connect/docs/it_staff_onboarding_instructions.md (1)

243-244: Documentation clearly explains both CSV and JSON duplicate SSN handling.

The two new constraints effectively answer the previous reviewer's question about expected behavior. Grammar has been corrected from earlier feedback, and the distinction between CSV (first row accepted, duplicates rejected) and JSON (entire batch rejected) is clear and accurate.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@landonshumway-ia landonshumway-ia marked this pull request as ready for review October 30, 2025 17:58
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (1)

147-169: Avoid O(n²) duplicate SSN scan. With ~200k-record uploads, scanning a growing list every time gives ~20 B string comparisons and will blow our ingest latency. Track first-seen rows in a dict (or set + index map) to stay O(n).

-    ssns_in_file_upload = []
+    ssns_in_file_upload: dict[str, int] = {}-                    if duplicate_ssn_check_flag_enabled:
-                        for index, record_ssn in enumerate(ssns_in_file_upload):
-                            if license_ssn == record_ssn:
-                                raise ValidationError(
-                                    message=f'Duplicate License SSN detected. SSN matches with record {index + 1}. '
-                                    f'Every record must have a unique SSN within the same file.'
-                                )
-                        ssns_in_file_upload.append(license_ssn)
+                    if duplicate_ssn_check_flag_enabled:
+                        first_record_index = ssns_in_file_upload.get(license_ssn)
+                        if first_record_index is not None:
+                            raise ValidationError(
+                                message=(
+                                    'Duplicate License SSN detected. '
+                                    f'SSN matches with record {first_record_index}. '
+                                    'Every record must have a unique SSN within the same file.'
+                                )
+                            )
+                        ssns_in_file_upload[license_ssn] = i + 1
🧹 Nitpick comments (8)
backend/compact-connect/docs/README.md (1)

34-34: Fix list indentation to satisfy markdownlint (MD007).

Align this bullet with the list style (or de-indent to 0).

Apply:

-   - SSNs must be unique within a single CSV upload file. Do not include multiple rows with the same `ssn` in one file
+ - SSNs must be unique within a single CSV upload file. Do not include multiple rows with the same `ssn` in one file
backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py (2)

106-106: Import enum from its canonical module for clarity.

Avoid pulling FeatureFlagEnum via feature_flag_client.

-from cc_common.feature_flag_client import FeatureFlagEnum, is_feature_enabled
+from cc_common.feature_flag_enum import FeatureFlagEnum
+from cc_common.feature_flag_client import is_feature_enabled

108-121: Normalize clinicalPrivilegeActionCategories to enum values.

When the list form is provided, ensure items are ClinicalPrivilegeActionCategory instances.

-        else:
-            adverse_action.clinicalPrivilegeActionCategories = adverse_action_request[
-                'clinicalPrivilegeActionCategories'
-            ]
+        else:
+            adverse_action.clinicalPrivilegeActionCategories = [
+                ClinicalPrivilegeActionCategory(v)
+                for v in adverse_action_request['clinicalPrivilegeActionCategories']
+            ]

This keeps types consistent with the single-value branch and downstream consumers. Based on learnings.

backend/compact-connect/docs/it_staff_onboarding_instructions.md (1)

243-245: Clarify JSON failure mode (optional).

Consider stating the response code for duplicate SSNs to aid integrators.

- - For JSON uploads, SSNs must be unique within a single request payload (array). Do not include duplicate `ssn` values in the same batch
+ - For JSON uploads, SSNs must be unique within a single request payload (array). Duplicate `ssn` values in the same batch will result in HTTP 400 (Bad Request).
backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py (1)

1431-1432: Use canonical enum import to avoid cross-module coupling.

Prefer importing FeatureFlagEnum from cc_common.feature_flag_enum.

-from cc_common.feature_flag_client import FeatureFlagEnum, is_feature_enabled
+from cc_common.feature_flag_enum import FeatureFlagEnum
+from cc_common.feature_flag_client import is_feature_enabled

Also applies to: 2677-2678

backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py (1)

378-381: Import enum from cc_common.feature_flag_enum for consistency.

Keeps enum definitions centralized and avoids re-export coupling.

-from cc_common.feature_flag_client import FeatureFlagEnum, is_feature_enabled
+from cc_common.feature_flag_enum import FeatureFlagEnum
+from cc_common.feature_flag_client import is_feature_enabled
backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py (2)

13-19: Enum import + flag caching considerations.

  • Import FeatureFlagEnum from cc_common.feature_flag_enum for consistency.
  • Module-level caching won’t reflect runtime toggles. If you expect toggling, consider a small TTL or per-request check (still fail-open).
-from cc_common.feature_flag_client import FeatureFlagEnum, is_feature_enabled  # noqa: E402
+from cc_common.feature_flag_enum import FeatureFlagEnum  # noqa: E402
+from cc_common.feature_flag_client import is_feature_enabled  # noqa: E402

Optional TTL pattern:

# simple TTL cache
_flag_cache = {'val': None, 'ts': None}
def _dup_ssn_check_enabled() -> bool:
    now = config.current_standard_datetime
    if not _flag_cache['ts'] or (now - _flag_cache['ts']).seconds > 300:
        _flag_cache['val'] = is_feature_enabled(FeatureFlagEnum.DUPLICATE_SSN_UPLOAD_CHECK_FLAG, fail_default=True)
        _flag_cache['ts'] = now
    return bool(_flag_cache['val'])

And guard with: if _dup_ssn_check_enabled():


74-86: Return actionable duplicate-SSN errors (indices and values).

Provide which SSNs are duplicated and where; keep error shape predictable.

-    if duplicate_ssn_check_flag_enabled:
-        # verify that none of the SSNs are repeats within the same batch
-        license_ssns = [license_record['ssn'] for license_record in licenses]
-        if len(set(license_ssns)) < len(license_ssns):
-            raise CCInvalidRequestCustomResponseException(
-                response_body={
-                    'message': 'Invalid license records in request. See errors for more detail.',
-                    'errors': {
-                        'SSN': 'Same SSN detected on multiple rows. '
-                        'Every record must have a unique SSN within the same request.'
-                    },
-                }
-            )
+    if duplicate_ssn_check_flag_enabled:
+        # verify that none of the SSNs are repeats within the same batch
+        ssn_to_indices: dict[str, list[int]] = {}
+        for i, rec in enumerate(licenses):
+            ssn_to_indices.setdefault(rec['ssn'], []).append(i)
+        duplicate_map = {ssn: idxs for ssn, idxs in ssn_to_indices.items() if len(idxs) > 1}
+        if duplicate_map:
+            raise CCInvalidRequestCustomResponseException(
+                response_body={
+                    'message': 'Invalid license records in request. See errors for more detail.',
+                    'errors': {'duplicateSsns': duplicate_map},
+                }
+            )

This helps clients locate and fix offending rows and keeps errors as structured data.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b51f243 and c3b53ad.

📒 Files selected for processing (15)
  • backend/compact-connect/docs/README.md (1 hunks)
  • backend/compact-connect/docs/it_staff_onboarding_instructions.md (1 hunks)
  • backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py (2 hunks)
  • backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py (1 hunks)
  • backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/common.py (1 hunks)
  • backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (2 hunks)
  • backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1 hunks)
  • backend/compact-connect/lambdas/python/common/tests/unit/test_feature_flag_client.py (13 hunks)
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (4 hunks)
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py (1 hunks)
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/ingest.py (1 hunks)
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py (2 hunks)
  • backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_bulk_upload.py (2 hunks)
  • backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_licenses.py (2 hunks)
  • backend/compact-connect/stacks/feature_flag_stack/__init__.py (1 hunks)
🧰 Additional context used
🧠 Learnings (16)
📚 Learning: 2025-10-28T15:05:49.315Z
Learnt from: jusdino
PR: csg-org/CompactConnect#1167
File: backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py:1698-1701
Timestamp: 2025-10-28T15:05:49.315Z
Learning: In the CompactConnect Python codebase, enums such as InvestigationStatusEnum, InvestigationAgainstEnum, ActiveInactiveStatus, LicenseEncumberedStatusEnum, and PrivilegeEncumberedStatusEnum derive from StrEnum. When writing DynamoDB AttributeValues (e.g., {'S': ...}) in files like backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py, pass the enum member directly (without .value). Future reviews should not recommend using .value for these enums.

Applied to files:

  • backend/compact-connect/lambdas/python/common/tests/unit/test_feature_flag_client.py
  • backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py
  • backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py
  • backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py
📚 Learning: 2025-09-12T14:16:08.280Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#1083
File: backend/compact-connect/bin/generate_mock_license_csv_upload_file.py:71-73
Timestamp: 2025-09-12T14:16:08.280Z
Learning: In backend/compact-connect/bin/generate_mock_license_csv_upload_file.py, the function parameters ssn_prefix and all_active intentionally do not have default values because the script ensures values are always passed from the top-level CLI argument parsing. Since this is a CLI-only script, defaults are not needed for the internal functions.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py
  • backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_licenses.py
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py
  • backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_bulk_upload.py
📚 Learning: 2025-09-12T14:14:53.986Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#1083
File: backend/compact-connect/bin/generate_mock_license_csv_upload_file.py:39-41
Timestamp: 2025-09-12T14:14:53.986Z
Learning: In backend/compact-connect/bin/generate_mock_license_csv_upload_file.py, the functions are intentionally designed for CLI-only usage and should not be called programmatically. The code is meant to fail fast if developers try to import and use these functions outside of the CLI context, rather than having defensive programming guards.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_licenses.py
  • backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_bulk_upload.py
📚 Learning: 2025-04-29T21:14:36.205Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#769
File: backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_encumbrance.py:38-40
Timestamp: 2025-04-29T21:14:36.205Z
Learning: In the CompactConnect codebase, `cc_common.config._Config.current_standard_datetime` is a property rather than a method, and should be patched in tests by directly providing a datetime value: `patch('cc_common.config._Config.current_standard_datetime', datetime.fromisoformat(DEFAULT_DATE_OF_UPDATE_TIMESTAMP))`.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_licenses.py
📚 Learning: 2025-04-29T21:14:36.205Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#769
File: backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_encumbrance.py:38-40
Timestamp: 2025-04-29T21:14:36.205Z
Learning: In the CompactConnect codebase, `cc_common.config._Config.current_standard_datetime` is a property (not a method), and can be properly patched in tests by directly supplying a datetime value rather than wrapping it in a lambda: `patch('cc_common.config._Config.current_standard_datetime', datetime.fromisoformat(DEFAULT_DATE_OF_UPDATE_TIMESTAMP))`.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_licenses.py
📚 Learning: 2025-04-29T21:14:36.205Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#769
File: backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_encumbrance.py:38-40
Timestamp: 2025-04-29T21:14:36.205Z
Learning: In the CompactConnect codebase, `cc_common.config._Config.current_standard_datetime` is implemented as a property that returns `datetime.now(tz=UTC).replace(microsecond=0)`. The correct way to patch it in tests is by directly providing the datetime value, not wrapping it in a lambda: `patch('cc_common.config._Config.current_standard_datetime', datetime.fromisoformat(DEFAULT_DATE_OF_UPDATE_TIMESTAMP))`.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_licenses.py
📚 Learning: 2025-09-11T20:06:40.709Z
Learnt from: ChiefStief
PR: csg-org/CompactConnect#1036
File: backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py:370-383
Timestamp: 2025-09-11T20:06:40.709Z
Learning: The EncumbranceDetailsSchema in backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/privilege/record.py does not contain a 'note' field. It only has clinicalPrivilegeActionCategory (String, optional), adverseActionId (UUID, required), and licenseJurisdiction (Jurisdiction, optional). When working with encumbrance notes, use encumbranceDetails['clinicalPrivilegeActionCategory'], not encumbranceDetails['note'].

Applied to files:

  • backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py
  • backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py
📚 Learning: 2025-09-11T20:06:40.709Z
Learnt from: ChiefStief
PR: csg-org/CompactConnect#1036
File: backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py:370-383
Timestamp: 2025-09-11T20:06:40.709Z
Learning: The EncumbranceDetailsSchema in backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/privilege/record.py does not contain a 'note' field. It only has clinicalPrivilegeActionCategory (String, optional), adverseActionId (UUID, required), and licenseJurisdiction (Jurisdiction, optional).

Applied to files:

  • backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py
  • backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py
📚 Learning: 2025-10-15T19:53:00.422Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#1135
File: backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py:2642-2654
Timestamp: 2025-10-15T19:53:00.422Z
Learning: In backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py and provider_record_util.py, the get_adverse_action_by_id method correctly handles UUID vs string comparison as written. Attempting to add explicit type conversion causes the mapping to fail. The current implementation should not be changed to add UUID/string conversion logic.

Applied to files:

  • backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py
  • backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py
📚 Learning: 2025-09-03T22:29:24.535Z
Learnt from: ChiefStief
PR: csg-org/CompactConnect#1036
File: backend/compact-connect/lambdas/python/data-events/handlers/encumbrance_events.py:0-0
Timestamp: 2025-09-03T22:29:24.535Z
Learning: The EncumbranceEventDetailSchema in backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/data_event/api.py is used across multiple instances/contexts where adverseActionCategory and adverseActionId fields may be required in some cases and not present in others. This is an intentional design pattern for handling different event variants with a single schema.

Applied to files:

  • backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py
  • backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py
📚 Learning: 2025-10-15T19:56:58.050Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#1135
File: backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/privilege/record.py:60-0
Timestamp: 2025-10-15T19:56:58.050Z
Learning: In the CompactConnect codebase, when migrating from a single field to a list field (e.g., clinicalPrivilegeActionCategory to clinicalPrivilegeActionCategories), both fields are intentionally kept as optional during the migration period to support backwards compatibility. Mutual exclusivity validation is not enforced during this phase, as the deprecated field will be removed in a follow-up PR with migration scripts.

Applied to files:

  • backend/compact-connect/docs/it_staff_onboarding_instructions.md
📚 Learning: 2025-09-04T17:56:36.542Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#1058
File: backend/compact-connect/docs/it_staff_onboarding_instructions.md:203-216
Timestamp: 2025-09-04T17:56:36.542Z
Learning: In the CompactConnect CSV upload process, the form field name "content-type" (lowercase) works successfully in practice, despite theoretical concerns about S3 POST policy case sensitivity. The user landonshumway-ia confirmed successful local testing with the current documentation format.

Applied to files:

  • backend/compact-connect/docs/it_staff_onboarding_instructions.md
📚 Learning: 2025-08-19T19:39:47.790Z
Learnt from: jsandoval81
PR: csg-org/CompactConnect#1019
File: webroot/src/network/licenseApi/data.api.ts:256-276
Timestamp: 2025-08-19T19:39:47.790Z
Learning: The backend API for CompactConnect will accept and require the new `encumbranceType` field in encumbrance requests, even though this field may not appear in the current (incomplete) API documentation.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py
📚 Learning: 2025-09-03T22:35:42.943Z
Learnt from: ChiefStief
PR: csg-org/CompactConnect#1036
File: backend/compact-connect/lambdas/python/data-events/handlers/encumbrance_events.py:200-204
Timestamp: 2025-09-03T22:35:42.943Z
Learning: The adverseActionCategory and adverseActionId fields in encumbrance events are only needed when the event flow creates privilege update database records. Some privilege encumbrance event publications don't create database records and don't need these fields, which is why they're optional in the EncumbranceEventDetailSchema.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py
📚 Learning: 2025-05-30T13:48:25.375Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#824
File: backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py:116-201
Timestamp: 2025-05-30T13:48:25.375Z
Learning: In encumbrance handling code, prefer to keep privilege and license encumbrance lifting implementations decoupled rather than extracting shared logic, as requirements between these implementations are likely to change in the future.

Applied to files:

  • backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py
📚 Learning: 2025-08-13T20:28:33.191Z
Learnt from: rmolinares
PR: csg-org/CompactConnect#1011
File: webroot/src/models/Licensee/Licensee.model.ts:276-279
Timestamp: 2025-08-13T20:28:33.191Z
Learning: The two-year encumbrance wait period feature only applies to privileges, not licenses. The logic should only check if privilege encumbrances that are no longer active have lift dates within the past two years.

Applied to files:

  • backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py
🧬 Code graph analysis (11)
backend/compact-connect/lambdas/python/common/tests/unit/test_feature_flag_client.py (2)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
  • is_feature_enabled (47-115)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/ingest.py (1)
backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/common.py (1)
  • UpdateCategory (287-301)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (2)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
  • is_feature_enabled (47-115)
backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_licenses.py (2)
backend/compact-connect/lambdas/python/common/common_test/sign_request.py (1)
  • sign_request (20-70)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py (1)
  • post_licenses (25-106)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/data_model/provider_record_util.py (2)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
  • is_feature_enabled (47-115)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py (4)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
  • is_feature_enabled (47-115)
backend/compact-connect/lambdas/python/provider-data-v1/license_csv_reader.py (1)
  • licenses (12-20)
backend/compact-connect/lambdas/python/common/cc_common/exceptions.py (1)
  • CCInvalidRequestCustomResponseException (11-17)
backend/compact-connect/stacks/feature_flag_stack/__init__.py (1)
backend/compact-connect/stacks/feature_flag_stack/feature_flag_resource.py (2)
  • FeatureFlagResource (23-72)
  • FeatureFlagEnvironmentName (15-20)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/encumbrance.py (2)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
  • is_feature_enabled (47-115)
backend/compact-connect/lambdas/python/common/cc_common/data_model/data_client.py (2)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
  • is_feature_enabled (47-115)
backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_bulk_upload.py (3)
backend/compact-connect/lambdas/python/provider-data-v1/tests/function/__init__.py (1)
  • TstFunction (22-436)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (1)
  • parse_bulk_upload_file (64-120)
backend/compact-connect/lambdas/python/common/cc_common/event_batch_writer.py (1)
  • put_event (42-49)
🪛 LanguageTool
backend/compact-connect/docs/it_staff_onboarding_instructions.md

[grammar] ~243-~243: Ensure spelling is correct
Context: ...ultiple rows with the same ssn in one upload - For JSON uploads, SSNs must be unique wi...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)

🪛 markdownlint-cli2 (0.18.1)
backend/compact-connect/docs/README.md

34-34: Unordered list indentation
Expected: 0; Actual: 3

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: TestApp
🔇 Additional comments (10)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/ingest.py (1)

300-301: Enum rename looks good; behavior unchanged.

Defaulting to LICENSE_UPLOAD_UPDATE_OTHER preserves the stored value and clarifies scope.

Please confirm downstream filters/analytics don’t rely on the old enum member name (string value appears unchanged).

backend/compact-connect/lambdas/python/common/tests/unit/test_feature_flag_client.py (1)

3-4: Enum-based flag usage in tests: LGTM.

Solid coverage across success/failure paths; URL assertion validates enum stringification.

Also applies to: 18-18, 40-40, 56-56, 76-76, 86-86, 100-100, 114-114, 129-129, 144-144, 159-159, 173-173, 225-225

backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)

47-75: Enum-typed flag checks look good. Switching to FeatureFlagEnum keeps call sites on the curated catalog—nice tightening.

backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_licenses.py (1)

372-400: Thanks for covering the duplicate SSN path. This regression test proves the new rejection shape and secures the feature flagged behavior.

backend/compact-connect/stacks/feature_flag_stack/__init__.py (1)

127-139: New flag resource wiring LGTM. Auto-enabling the duplicate-SSN gate across envs keeps rollout aligned with the runtime check.

backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/common.py (1)

299-301: No action required – enum rename is complete and safe.

The rename from UpdateCategory.OTHER to UpdateCategory.LICENSE_UPLOAD_UPDATE_OTHER has already been fully applied. Only one reference to this enum value exists in the codebase (ingest.py:300), and it already uses the new name. No references to the old name remain, confirming all downstream callers have been updated.

backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)

1-15: LGTM! Clean feature flag enumeration.

The implementation provides a centralized, type-safe catalog for feature flags. The docstring clearly explains the lifecycle, and the enum structure will help prevent typos and improve IDE support across the codebase.

backend/compact-connect/lambdas/python/provider-data-v1/tests/function/test_handlers/test_bulk_upload.py (3)

3-3: LGTM! Appropriate test imports.

Adding MagicMock and patch to support feature flag mocking is the right approach for testing feature-gated behavior.


11-12: LGTM! Mock setup is correct.

The mock_flag_client returning True enables feature-flag-gated code paths for testing.


274-356: No duplicate test method exists.

The verification shows the test method test_bulk_upload_prevents_repeated_ssns_within_the_same_file_upload appears only once at line 274. The review comment's concern about duplicate test insertion is unfounded.

Likely an incorrect or invalid review comment.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (1)

17-18: Consider moving import to top and relocating comment.

The comment explains the module-level flag caching (lines 28-30) but appears next to the import statement. The import itself could be placed with other imports at the top of the file (lines 1-14), and the comment would be clearer if moved to line 28 where the actual flag initialization occurs.

Apply this diff to clean up the organization:

 from cc_common.data_model.schema.license.api import (
     LicensePostRequestSchema,
     LicenseReportResponseSchema,
 )
 from cc_common.event_batch_writer import EventBatchWriter
 from cc_common.exceptions import CCInternalException
-
-# initialize flag outside of handler so the flag is cached for the lifecycle of the container
-from cc_common.feature_flag_client import FeatureFlagEnum, is_feature_enabled  # noqa: E402
+from cc_common.feature_flag_client import FeatureFlagEnum, is_feature_enabled
 from cc_common.utils import (
     ResponseEncoder,
     api_handler,

Then add the comment at line 28:

 from license_csv_reader import LicenseCSVReader
 from marshmallow import ValidationError
 
+# Initialize flag at module level so it's cached for the lifecycle of the container
 duplicate_ssn_check_flag_enabled = is_feature_enabled(
     FeatureFlagEnum.DUPLICATE_SSN_UPLOAD_CHECK_FLAG, fail_default=True
 )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9d7e82f and 5e63011.

📒 Files selected for processing (1)
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (4 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-12T14:16:08.280Z
Learnt from: landonshumway-ia
PR: csg-org/CompactConnect#1083
File: backend/compact-connect/bin/generate_mock_license_csv_upload_file.py:71-73
Timestamp: 2025-09-12T14:16:08.280Z
Learning: In backend/compact-connect/bin/generate_mock_license_csv_upload_file.py, the function parameters ssn_prefix and all_active intentionally do not have default values because the script ensures values are always passed from the top-level CLI argument parsing. Since this is a CLI-only script, defaults are not needed for the internal functions.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py
🧬 Code graph analysis (1)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (2)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
  • is_feature_enabled (47-115)
🔇 Additional comments (3)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (3)

28-30: Good choice of fail_default=True for data integrity.

Setting fail_default=True ensures that duplicate SSN checking is enabled even when the feature flag service is unavailable. This is the correct approach for a validation feature that protects data integrity, preventing the system from accepting problematic uploads when the flag service is down.


149-151: Correct placement for per-file duplicate tracking.

Initializing the SSN tracker inside process_bulk_upload_file ensures each file upload gets its own tracker instance. The tracker correctly persists across all batches within the file, enabling duplicate detection across the entire upload.


161-171: <!-- [scratchpad_start]
[task verify_ssn_normalization 3/3]
Verify findings

[observations]

  • SocialSecurityNumber field class (fields.py:26-28) enforces strict regex: ^[0-9]{3}-[0-9]{2}-[0-9]{4}$
  • This regex REQUIRES format XXX-XX-XXXX (with hyphens)
  • Validation rejects any SSN not in this exact format
  • Schema validation happens BEFORE the duplicate check at line 164-171
  • All SSNs reaching the duplicate detection are guaranteed to be in consistent format

[analysis]
Original review concern: "If SSNs can be validated in different formats like '123-45-6789' and '123456789', the duplicate detection would miss same-SSN records"

Reality: SSNs cannot pass validation in different formats. The schema enforces strict format XXX-XX-XXXX. Variations like "123456789" or "123 45 6789" will fail validation before reaching duplicate check.

Therefore:

  • No SSN normalization logic needed (schema handles it)
  • Duplicate detection logic is sound
  • Verification request in original review is moot

[pending]
None - verification complete

[done]
Schema validation enforces strict SSN format. Duplicate detection is safe. Review comment should be updated to remove verification request and confirm no issues found.
[scratchpad_end]
-->

Duplicate SSN detection logic is sound and safe.

The implementation correctly:

  • Checks for duplicates only when the feature flag is enabled
  • Raises a clear ValidationError referencing the first occurrence by record number
  • Skips tracking SSNs for records that fail earlier validation
  • Adds the SSN to the tracker only after confirming it's not a duplicate

The SocialSecurityNumber schema field enforces a strict regex pattern ^[0-9]{3}-[0-9]{2}-[0-9]{4}$ that requires the format XXX-XX-XXXX. All SSNs are validated to this consistent format before reaching the duplicate detection logic, so there's no risk of format variations bypassing the check.

Minor simplification: Line 171 can use direct dictionary assignment instead of .update():

-            ssns_in_file_upload.update({license_ssn: i + 1})
+            ssns_in_file_upload[license_ssn] = i + 1

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)

62-79: Docstring examples should match the new enum API.

The examples still pass raw strings (e.g., 'test-feature'), but is_feature_enabled now expects a FeatureFlagEnum. Updating the snippets to use FeatureFlagEnum.TEST_FLAG (and similar) will keep callers aligned with the new contract.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e63011 and 9f66c2b.

📒 Files selected for processing (2)
  • backend/common-cdk/common_constructs/stack.py (1 hunks)
  • backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-10-10T18:45:43.801Z
Learnt from: jsandoval81
Repo: csg-org/CompactConnect PR: 1114
File: backend/compact-connect-ui-app/stacks/frontend_deployment_stack/deployment.py:34-35
Timestamp: 2025-10-10T18:45:43.801Z
Learning: In backend/compact-connect-ui-app/stacks/frontend_deployment_stack/deployment.py, the cdk.context.deploy-example.json file is intentionally minimal and serves as a template, not for actual deployments. Actual deployment environments (test, beta, prod, sandbox) contain all required environment-specific keys like statsig_key, app_env, recaptcha_public_key, and robots_meta. This is by design.

Applied to files:

  • backend/common-cdk/common_constructs/stack.py
📚 Learning: 2025-10-09T21:11:36.590Z
Learnt from: jusdino
Repo: csg-org/CompactConnect PR: 1143
File: backend/compact-connect/tests/app/base.py:514-530
Timestamp: 2025-10-09T21:11:36.590Z
Learning: In the BackendStage class within the CompactConnect backend infrastructure, the following stacks are always required and their absence indicates a serious error: api_lambda_stack, api_stack, disaster_recovery_stack, event_listener_stack, feature_flag_stack, ingest_stack, managed_login_stack, persistent_stack, provider_users_stack, state_api_stack, state_auth_stack, and transaction_monitoring_stack. Only backup_infrastructure_stack can be None (when backups are disabled). The reporting_stack and notification_stack are conditionally present based on hosted_zone configuration.

Applied to files:

  • backend/common-cdk/common_constructs/stack.py
📚 Learning: 2025-08-18T21:59:25.029Z
Learnt from: landonshumway-ia
Repo: csg-org/CompactConnect PR: 1014
File: backend/compact-connect/stacks/api_stack/v1_api/api.py:31-36
Timestamp: 2025-08-18T21:59:25.029Z
Learning: The CompactConnect codebase has two separate V1Api implementations: one for the internal API stack (backend/compact-connect/stacks/api_stack/v1_api/api.py) and one for the state API stack (backend/compact-connect/stacks/state_api_stack/v1_api/api.py). These are distinct classes with different constructor signatures and purposes. Changes to one V1Api implementation do not necessarily affect the other.

Applied to files:

  • backend/common-cdk/common_constructs/stack.py
🧬 Code graph analysis (1)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: TestApp
  • GitHub Check: TestApp
🔇 Additional comments (1)
backend/common-cdk/common_constructs/stack.py (1)

100-103: API_BASE_URL propagation looks good.

This ensures feature-flag clients receive the correct base URL whenever a hosted API domain is configured, while leaving local-only contexts untouched.

Copy link
Collaborator

@jusdino jusdino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@landonshumway-ia
Copy link
Collaborator Author

@jlkravitz This is ready for your review. Thanks

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py (1)

74-86: Enhance error message to identify specific duplicate SSNs and record indices.

The duplicate detection logic is correct, but the error message could be more actionable. Currently it only states that duplicates exist, without identifying which SSNs are duplicated or which record indices contain them. For large batches, users need specific information to fix the issue.

Additionally, the error format is inconsistent with other validation errors in this handler—previous errors are keyed by record index (e.g., {"0": {...}, "5": {...}}), while this uses "SSN" as the key.

Consider this approach to provide more helpful error details:

 if duplicate_ssn_check_flag_enabled:
     # verify that none of the SSNs are repeats within the same batch
     license_ssns = [license_record['ssn'] for license_record in licenses]
     if len(set(license_ssns)) < len(license_ssns):
+        # Identify which SSNs are duplicated and at which indices
+        ssn_indices = {}
+        for i, ssn in enumerate(license_ssns):
+            ssn_indices.setdefault(ssn, []).append(i)
+        duplicate_details = {
+            str(i): {'ssn': [f'SSN {ssn} is duplicated at indices: {", ".join(map(str, indices))}']}
+            for ssn, indices in ssn_indices.items() if len(indices) > 1
+            for i in indices
+        }
         raise CCInvalidRequestCustomResponseException(
             response_body={
                 'message': 'Invalid license records in request. See errors for more detail.',
-                'errors': {
-                    'SSN': 'Same SSN detected on multiple rows. '
-                    'Every record must have a unique SSN within the same request.'
-                },
+                'errors': duplicate_details,
             }
         )

This matches the error format used elsewhere in the handler and provides specific guidance on which records need correction.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f66c2b and 8321f3b.

📒 Files selected for processing (1)
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-12T14:16:08.280Z
Learnt from: landonshumway-ia
Repo: csg-org/CompactConnect PR: 1083
File: backend/compact-connect/bin/generate_mock_license_csv_upload_file.py:71-73
Timestamp: 2025-09-12T14:16:08.280Z
Learning: In backend/compact-connect/bin/generate_mock_license_csv_upload_file.py, the function parameters ssn_prefix and all_active intentionally do not have default values because the script ensures values are always passed from the top-level CLI argument parsing. Since this is a CLI-only script, defaults are not needed for the internal functions.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py
🧬 Code graph analysis (1)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py (4)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
  • is_feature_enabled (47-117)
backend/compact-connect/lambdas/python/provider-data-v1/license_csv_reader.py (1)
  • licenses (12-20)
backend/compact-connect/lambdas/python/common/cc_common/exceptions.py (1)
  • CCInvalidRequestCustomResponseException (11-17)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: TestApp
  • GitHub Check: TestApp
🔇 Additional comments (1)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/licenses.py (1)

13-19: Verify that fail_default=True aligns with operational intent.

The feature flag is correctly cached at module scope for Lambda lifecycle performance. However, fail_default=True means the duplicate SSN check will be enforced even if the feature flag service is unavailable. While the comment indicates this is a "low risk flag," please confirm this aligns with your rollout strategy—if you need the ability to fully disable the check during an incident, fail_default=False would be safer.

Copy link
Collaborator

@jlkravitz jlkravitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

few small questions!

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (2)

17-18: Clarify "execution environment" terminology.

The comment references "lambda execution environment" which is accurate AWS Lambda terminology. However, per past review discussion, consider whether "execution environment" alone would be clearer for readers less familiar with AWS Lambda internals.

Apply this diff if you prefer the more general terminology:

-# initialize flag outside of handler so the flag is cached for the lifecycle of the lambda execution environment
+# initialize flag outside of handler so the flag is cached for the lifecycle of the execution environment

161-171: Consider explicit None check for matched_ssn_index.

The truthiness check on line 165 works correctly since record indices start at 1, but it relies on implicit behavior. An explicit is not None check would be more robust and self-documenting.

Apply this diff:

                     license_ssn = validated_license['ssn']
                     if duplicate_ssn_check_flag_enabled:
                         matched_ssn_index = ssns_in_file_upload.get(license_ssn)
-                        if matched_ssn_index:
+                        if matched_ssn_index is not None:
                                 raise ValidationError(
                                     message=f'Duplicate License SSN detected. SSN matches with record '
                                             f'{matched_ssn_index}. Every record must have a unique SSN within the same '
                                             f'file.'
                                 )
                         ssns_in_file_upload.update({license_ssn: i + 1})
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8321f3b and 1b3badf.

📒 Files selected for processing (3)
  • backend/compact-connect/docs/README.md (1 hunks)
  • backend/compact-connect/docs/it_staff_onboarding_instructions.md (1 hunks)
  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (4 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-10-15T19:56:58.050Z
Learnt from: landonshumway-ia
Repo: csg-org/CompactConnect PR: 1135
File: backend/compact-connect/lambdas/python/common/cc_common/data_model/schema/privilege/record.py:60-0
Timestamp: 2025-10-15T19:56:58.050Z
Learning: In the CompactConnect codebase, when migrating from a single field to a list field (e.g., clinicalPrivilegeActionCategory to clinicalPrivilegeActionCategories), both fields are intentionally kept as optional during the migration period to support backwards compatibility. Mutual exclusivity validation is not enforced during this phase, as the deprecated field will be removed in a follow-up PR with migration scripts.

Applied to files:

  • backend/compact-connect/docs/it_staff_onboarding_instructions.md
📚 Learning: 2025-09-04T17:56:36.542Z
Learnt from: landonshumway-ia
Repo: csg-org/CompactConnect PR: 1058
File: backend/compact-connect/docs/it_staff_onboarding_instructions.md:203-216
Timestamp: 2025-09-04T17:56:36.542Z
Learning: In the CompactConnect CSV upload process, the form field name "content-type" (lowercase) works successfully in practice, despite theoretical concerns about S3 POST policy case sensitivity. The user landonshumway-ia confirmed successful local testing with the current documentation format.

Applied to files:

  • backend/compact-connect/docs/it_staff_onboarding_instructions.md
📚 Learning: 2025-09-12T14:16:08.280Z
Learnt from: landonshumway-ia
Repo: csg-org/CompactConnect PR: 1083
File: backend/compact-connect/bin/generate_mock_license_csv_upload_file.py:71-73
Timestamp: 2025-09-12T14:16:08.280Z
Learning: In backend/compact-connect/bin/generate_mock_license_csv_upload_file.py, the function parameters ssn_prefix and all_active intentionally do not have default values because the script ensures values are always passed from the top-level CLI argument parsing. Since this is a CLI-only script, defaults are not needed for the internal functions.

Applied to files:

  • backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py
🧬 Code graph analysis (1)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (2)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_enum.py (1)
  • FeatureFlagEnum (4-15)
backend/compact-connect/lambdas/python/common/cc_common/feature_flag_client.py (1)
  • is_feature_enabled (47-117)
🪛 markdownlint-cli2 (0.18.1)
backend/compact-connect/docs/README.md

34-34: Unordered list indentation
Expected: 0; Actual: 3

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: TestApp
🔇 Additional comments (2)
backend/compact-connect/lambdas/python/provider-data-v1/handlers/bulk_upload.py (1)

28-30: Good choice of fail_default=True for duplicate SSN check.

Using fail_default=True ensures that if the feature flag service is unavailable, the duplicate SSN validation will still be enabled by default. This is the correct safety posture for a data integrity feature.

backend/compact-connect/docs/it_staff_onboarding_instructions.md (1)

243-244: Clear documentation of SSN uniqueness requirements.

The distinction between CSV behavior (first wins, others rejected) and JSON behavior (entire batch rejected) is well documented and matches the implementation. This will help states understand the different error handling approaches for each upload method.

Copy link
Collaborator

@jlkravitz jlkravitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isabeleliassen Good to merge.

@isabeleliassen isabeleliassen merged commit 137ac54 into csg-org:development Nov 3, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Prevent license CSV batch uploads that repeat SSNs

4 participants