789 out of ~22K IDs were being truncated from 7 to 6 digits
-- 1) Merge with geocoding results and create a unique ID
WITH hny AS (
    SELECT
        a.project_id || '/' || coalesce(lpad(a.building_id, 6, '0'), '') AS hny_id,
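For context on why this line truncates: PostgreSQL's `lpad` cuts a string down to the requested length when it is already longer, so a 7-digit `building_id` loses its last digit. The sketch below emulates that behaviour in Python and contrasts it with a pad-only variant. `pg_lpad`, `pad_min_width`, and `hny_id` are hypothetical helper names used for illustration, not functions from this repo, and the pad-only variant is one possible fix, not necessarily the one in this PR.

```python
def pg_lpad(value: str, length: int, fill: str = " ") -> str:
    """Emulate PostgreSQL lpad(): left-pad to `length`, truncating on the right if longer."""
    if len(value) >= length:
        return value[:length]  # this is the step that drops the 7th digit
    if not fill:
        return value
    missing = length - len(value)
    return (fill * missing)[:missing] + value


def pad_min_width(value: str, width: int = 6, fill: str = "0") -> str:
    """Hypothetical alternative: pad short IDs up to `width` but never truncate."""
    return value.rjust(width, fill)


def hny_id(project_id: str, building_id, pad) -> str:
    """Mirror project_id || '/' || coalesce(pad(building_id), '') for a non-null project_id."""
    padded = "" if building_id is None else pad(building_id)
    return f"{project_id}/{padded}"


# Values taken from the comparison output in this PR.
old = hny_id("69582", "1015477", lambda b: pg_lpad(b, 6, "0"))
new = hny_id("69582", "1015477", pad_min_width)
print(old)  # 69582/101547
print(new)  # 69582/1015477
```

The old ID is 12 characters and the new one is 13, which lines up with the 12-to-13 change described under checks.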
Is there a big reason to not just lpad to 7 digits? Or not lpad at all and have this not be a fixed length field?
I should've checked whether the lpad is even needed at all. I just assumed it was here because the other keys have historically been 6 digits, and Housing didn't seem to see any issues other than the 7-to-6 problem.
@fvankrieken these are the input and output lengths of building_id values in the subquery. Looks like the lpad is needed. Sam has described the "underlying HPD data" that they're trying to join their corrections to as having 6- and 7-digit building IDs.
| source_id_length | subquery_id_length | count |
|---|---|---|
| 3 | 6 | 2 |
| 4 | 6 | 41 |
| 5 | 6 | 402 |
| 6 | 6 | 6907 |
| 7 | 7 | 788 |
Wish the queries weren't so hard to untangle so I could point to those assumptions and even add tests for them. Shall refactor soon!
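Until the queries are refactored, the assumptions behind the length table could be pinned down with a small check like the one below. This is only a sketch: `length_transitions` and `truncated_ids` are hypothetical helpers, the padding rule is assumed to be "pad to at least 6, never truncate", and the sample IDs are made up apart from `1015477`, which appears in the comparison output.

```python
from collections import Counter


def length_transitions(source_ids, transform):
    """Count (source length, output length) pairs, like the table above."""
    return Counter((len(s), len(transform(s))) for s in source_ids)


def truncated_ids(source_ids, transform):
    """Return the IDs that come out shorter than they went in."""
    return [s for s in source_ids if len(transform(s)) < len(s)]


def pad(building_id: str) -> str:
    # Assumed rule: zero-pad short IDs to 6 characters, leave longer ones untouched.
    return building_id.rjust(6, "0")


# Illustrative inputs only; real values would come from the HPD source table.
sample = ["123", "1234", "12345", "123456", "1015477"]

for (src_len, out_len), count in sorted(length_transitions(sample, pad).items()):
    print(src_len, out_len, count)

assert truncated_ids(sample, pad) == []
```

Run against the real `building_id` column, the first function would reproduce the table and the second would act as the regression test for the 7-to-6 problem.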
resolves #2213
all builds on this branch
checks
used the new query in dbeaver and basically looked at records in the modified subquery where hny_id_old_length != hny_id_new_length. there are 789 such records and they all went from 12 to 13 characters

also used in-development table comparison tool on the devdb_hny_lookup table that's created by the modified SQL script. it shows the expected diffs in the hny_id columns

❯ python3 -m dcpy lifecycle scripts compare_build_tables compare db-devdb dm_fix_hpd_ids devdb_hny_lookup
11:30:21 INFO:dcpy:Comparing dm_fix_hpd_ids.devdb_hny_lookup (left) to nightly_qa.devdb_hny_lookup (right) ...
________________________________________________________________________________
tables
left: devdb_hny_lookup
right: devdb_hny_lookup
________________________________________________________________________________
row_count
left: 8361
right: 8438
________________________________________________________________________________
column_comparison
both
    all_hny_units
    classa_hnyaff
    hny_id
    hny_jobrelate
    job_number
left_only: None
right_only: None
type_differences: None
________________________________________________________________________________
data_comparison
compared_columns
    all_hny_units
    classa_hnyaff
    hny_id
    hny_jobrelate
    job_number
ignored_columns: None
columns_coerced_to_numeric: None
left_only
1319 rows. First 20 shown
    classa_hnyaff  job_number  hny_id         hny_jobrelate  all_hny_units  row_hash                          match_count  dev_count  prod_count
 0  0              321596803   Multiple       one-to-many    2              a60df2018094a1bca0399aee1cdf8446  1            1          0
 1  1              S00555943   69582/1015477  one-to-one     1              ad681317e627cdb03f016013d4b078bb  1            1          0
 2  1              S00570356   69582/1015478  one-to-one     1              ea0245559eb5c4e25c9cc0423d588985  1            1          0
 3  1              S00570405   69582/1015476  one-to-one     1              b6d374fedd7d7a6058a08c208281cef2  1            1          0
 4  1              S00570422   69582/1015474  one-to-one     1              777aa93345f2405e0692431d3e631472  1            1          0
 5  1              S00570509   69582/1015480  one-to-one     1              3b1a8545d89de1bbece37ccc15537683  1            1          0
 6  1              S00570523   69582/1015483  one-to-one     1              5dc4e75124f220ad1943bee276025eda  1            1          0
 7  10             220152242   74169/1005039  one-to-one     10             81e177b4b600e6df2c0d519635a92cc8  1            1          0
 8  10             220640251   74578/1010365  one-to-one     32             c605f74b9798a769d83f00381725f2a3  1            1          0
 9  10             220672742   73246/1009578  one-to-one     31             c8efaca5f67ad1fa50f80c0e4c8f65ec  1            1          0
10  10             321590578   70913/1017549  one-to-one     10             fcb3d9ecda54d75577228b9a5f3116ef  1            1          0
11  10             321594388   73698/1007968  one-to-one     33             dfe80edfdc9abff1cdaa7aa0c4d123fe  1            1          0
12  10             321600772   68222/1005398  one-to-one     10             9c596befa1d5b10247410e512c4ca4dc  1            1          0
13  10             321954364   74772/1013466  one-to-one     51             a89fedaf46887c90b20cdfed2d6f912b  1            1          0
14  10             321995917   73763/1005140  one-to-one     33             f480fe1832149f7ba27b8f0416c68666  1            1          0
15  10             340754945   71939/1006281  one-to-one     32             b65ef2d80655173280cf5014a2245492  1            1          0
16  10             421133026   70072/1014128  one-to-one     35             a6a945a8e5f9993b307264eb6901d49b  1            1          0
17  100            X00696576   76752/1016054  one-to-one     100            d3b54554ddb5a6b45b565013a88b5d71  1            1          0
18  101            X00554868   74615/1009388  one-to-one     101            b268b3889264f01718207db06378f067  1            1          0
19  103            321592763   69611/1017598  one-to-one     103            0dbb18152dcd538865b87edc6fd5d71c  1            1          0
right_only
1396 rows. First 20 shown
    classa_hnyaff  job_number  hny_id         hny_jobrelate  all_hny_units  row_hash                          match_count  dev_count  prod_count
 0  0              321596778   Multiple       many-to-many   2              7ed5528245d9d25cfba8f6811ad7ff48  1            0          1
 1  1              S00555943   69582/101547   many-to-one    1              2d86f0739ad1b395db0e9e423d10e552  1            0          1
 2  10             220152242   74169/100503   one-to-one     10             4f3494eaa68a9a92637fadebe7209e03  1            0          1
 3  10             220640251   74578/101036   one-to-one     32             088c19fee3aec359291da5fb50a38eb0  1            0          1
 4  10             220672742   73246/100957   one-to-one     31             dea67baa2570865726422dad2954d140  1            0          1
 5  10             321590578   70913/101754   one-to-one     10             3061093328357794ad9908985b581d58  1            0          1
 6  10             321594388   73698/100796   one-to-one     33             3ac3d69b2b80c92d509fed15fa93809b  1            0          1
 7  10             321600772   68222/100539   many-to-one    10             aa6ff47071d02982d15f18a14a3413d9  1            0          1
 8  10             321954364   74772/101346   one-to-one     51             4c189b2f028854c0e0cf64b3235935a5  1            0          1
 9  10             321995917   73763/100514   one-to-one     33             40e02cb9618e21f778d98c506b6e2cfb  1            0          1
10  10             340754945   71939/100628   one-to-one     32             d180d27f48576c6e35dfc5ee601ac513  1            0          1
11  10             421133026   70072/101412   one-to-one     35             4a5e6cc748419f1dc94cf38f981b207b  1            0          1
12  10             421249615   73211/100912   one-to-one     31             301f22b55dee5ae71983d359f422ba0d  1            0          1
13  100            X00696576   76752/101605   one-to-one     100            16fe78ec07f238f4ac376385b419b9d6  1            0          1
14  101            X00554868   74615/100938   one-to-one     101            6eb9584e81fa73008f3fd85951f26b34  1            0          1
15  103            321592763   69611/101759   one-to-one     103            0ef2c0a41f5b13b19908483d492cf719  1            0          1
16  105            210182309   69565/100458   one-to-one     105            3bb8c988df8b5991c746b8caaf86b4df  1            0          1
17  106            Q08012514   72051/100874   one-to-one     106            eb41e81cc137c063d608eb3961656b96  1            0          1
18  108            421133151   70115/100285   one-to-one     614            e475d7d70718124d71070d702f6a9e82  1            0          1
19  109            B00568816   74457/101012   one-to-one     109            d300152d134742d045eb60c8794a6a48  1            0          1
are_equal: False
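For readers unfamiliar with the comparison output, the left_only / right_only split can be thought of as a row-level multiset difference between the two tables. The sketch below illustrates that idea only; it is not dcpy's implementation, and `compare_rows` is a hypothetical name. The two `S00555943` rows use values from the output above (job_number, hny_id, hny_jobrelate), while the shared row is made up.

```python
from collections import Counter


def compare_rows(left, right):
    """Split rows into those found only on the left or only on the right."""
    left_counts, right_counts = Counter(left), Counter(right)
    left_only = list((left_counts - right_counts).elements())
    right_only = list((right_counts - left_counts).elements())
    return {
        "left_only": left_only,
        "right_only": right_only,
        "are_equal": not left_only and not right_only,
    }


shared = ("J0000001", "00001/000001", "one-to-one")  # made-up row present in both tables
dev = [shared, ("S00555943", "69582/1015477", "one-to-one")]
prod = [shared, ("S00555943", "69582/101547", "many-to-one")]

result = compare_rows(dev, prod)
print(result["left_only"])   # the row with the full 7-digit building ID
print(result["right_only"])  # the row with the truncated ID
print(result["are_equal"])   # False
```

Because a changed hny_id makes the whole row differ, each corrected record shows up once on each side, which is why both left_only and right_only are large even though only one column was modified.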