Skip to content

Fix sp_nrt_ldf_dimensional_data_postprocessing: stop early-RETURN guard from misclassifying filtered ldf_uids, harmonize INNER→LEFT JOIN#839

Open
nullflux wants to merge 2 commits into
mainfrom
aw/app-471/bug-7
Open

Fix sp_nrt_ldf_dimensional_data_postprocessing: stop early-RETURN guard from misclassifying filtered ldf_uids, harmonize INNER→LEFT JOIN#839
nullflux wants to merge 2 commits into
mainfrom
aw/app-471/bug-7

Conversation

@nullflux
Copy link
Copy Markdown
Contributor

@nullflux nullflux commented May 20, 2026

Description

sp_nrt_ldf_dimensional_data_postprocessing never populated LDF_DIMENSIONAL_DATA for fixtures with valid LDF answers. Two independent defects, both fixed here:

  1. Early-RETURN guard misclassification (lines 136-158). @backfill_list was computed against #LDF_META_DATA, which had already been filtered by a data_type IN ('ST','CV','LIST_ST') whitelist. Any ldf_uid with a data_type outside the whitelist (e.g. SUB) showed up as "missing" from the metadata set, tripped the guard, and aborted the batch. Guard now reads nrt_odse_state_defined_field_metadata directly so it only fires for genuinely missing source rows.

  2. INNER vs LEFT JOIN inconsistency (line 648). The #LDF_DATA step joined nrt_srte_LDF_PAGE_SET with INNER JOIN, but the metadata step at line 108 uses LEFT JOIN on the same relationship. Since nrt_ldf_data.ldf_page_id can legitimately be NULL, the INNER side was silently dropping rows. Harmonized to LEFT JOIN.

Verified locally: pre-fix the SP exited at step 2 logging "Missing NRT Record" with LDF_DIMENSIONAL_DATA at 0; post-fix it runs to SP_COMPLETE, step 25 inserts 5 rows, and the table has 5 rows.

Related Issue

APP-471

Additional Notes

A testData/unit fixture is included (isolated 97xxxxxx UID namespace) that seeds three ldf_uids — a SUB, a CV, and an ST — and asserts the two non-SUB rows land in LDF_DIMENSIONAL_DATA while SUB is filtered cleanly through the no-longer-overzealous guard.

Unblocks the per-condition LDF tables (LDF_TETANUS, LDF_HEPATITIS, etc.) for downstream datamart SPs. The unguarded SUBSTRING that fires when those tables are still empty is handled separately in #840

Checklist

  • I have ensured that the pull request is of a manageable size, allowing it to be reviewed within a single session.
  • I have reviewed my changes to ensure they are clear, concise, and well-documented.
  • I have updated the documentation, if applicable.
  • I have added or updated test cases to cover my changes, if applicable.

nullflux added 2 commits May 17, 2026 19:13
…LDF_DIMENSIONAL_DATA

The early-RETURN guard at lines 136-158 computed @backfill_list against
#LDF_META_DATA, conflating data_type-whitelist filtering with truly missing
NRT metadata rows. Any ldf_uid with data_type NOT IN ('ST','CV','LIST_ST')
caused the entire batch to abort. Guard now checks
nrt_odse_state_defined_field_metadata directly so it only fires for genuinely
missing source rows. Also harmonized the #LDF_DATA step (line 648) to use
LEFT JOIN on nrt_srte_LDF_PAGE_SET, matching the #LDF_META_DATA step at
line 108; nrt_ldf_data.ldf_page_id can legitimately be NULL and the prior
INNER JOIN silently dropped those rows.
DataDrivenUnitTests-compatible test that seeds nrt_ldf_data with three
ldf_uids - one SUB (data_type whitelist filtered) plus a CV and ST - then
EXECs sp_nrt_ldf_dimensional_data_postprocessing and asserts
LDF_DIMENSIONAL_DATA receives the two valid rows. Pre-fix, the SUB row
triggered the early-RETURN guard and LDF_DIMENSIONAL_DATA stayed empty.
Uses an isolated 97xxxxxx UID namespace to avoid colliding with fixture or
baseline rows.
@nullflux nullflux requested a review from a team as a code owner May 20, 2026 16:51
nullflux added a commit that referenced this pull request May 20, 2026
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
nullflux added a commit that referenced this pull request May 20, 2026
…PR routines in orchestrator

Two changes:

1. Extend bug #5b's patient_id-NULL fix to the 10 multi-condition
   nrt_investigation variants (22000010-22000100) and the Tetanus
   variant (22000200). The original cherry-pick (bb7cec9) only set
   patient_id on the foundation Inv (20000100); the Tier 3 variants
   left it NULL, so the same DELETE-by-sentinel-PATIENT_UID cascade
   that blocks HEPATITIS_DATAMART would block any condition-datamart
   SP whose query path goes through F_PAGE_CASE -> D_PATIENT lookup.

2. New orchestrator step 2.5 apply_pending_pr_routines pulls each
   pending-PR routine (PRs #837, #839, #840) from its bug branch via
   `git show` and applies it to the freshly-reset DB before fixtures.
   Without this the orchestrator aborts at step 8 on
   ldf_answers_tetanus.sql (bug #7 early-RETURN guard fires on a SUB
   data_type ldf_uid, and bug #8 SUBSTRING fires on the tetanus
   datamart SP). Remove a branch from the list once its PR merges and
   the baseline image is refreshed.

End-to-end coverage uplift vs. coverage_tier_3.md baseline:
  HEPATITIS_DATAMART         0 -> 1    (5b)
  COVID_CASE_DATAMART        0 -> 1    (5b on 22000070)
  BMIRD_STREP_PNEUMO_DATAMART 0 -> 1   (5b on 22000100)
  F_PAGE_CASE                1 -> 6    (multi-condition + 5b)
  LDF_DIMENSIONAL_DATA       0 -> 5    (PR #839 bug-7)
  MORB_RPT_USER_COMMENT      0 -> 1    (PR #837 bug-3)

Still 0 (separate per-condition blockers, not 5b-related): HEPATITIS_CASE,
HEP100, F_STD_PAGE_CASE, STD_HIV_DATAMART, TB_DATAMART, VAR_DATAMART,
LDF_TETANUS, LDF_HEPATITIS — see coverage_hep_datamart_investigation.md
for per-condition diagnoses.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants