Skip to content

fix: address CodeQL real-bug findings (6 surgical changes)#170

Merged
williamzujkowski merged 1 commit into
mainfrom
fix/codeql-real-bugs
May 12, 2026
Merged

fix: address CodeQL real-bug findings (6 surgical changes)#170
williamzujkowski merged 1 commit into
mainfrom
fix/codeql-real-bugs

Conversation

@williamzujkowski
Copy link
Copy Markdown
Collaborator

Summary

Six targeted fixes for findings from the CodeQL scan enabled in #166. Each is independent and small. Tests added for the two with subtle correctness implications.

# File:line Finding Fix
1 apps/web/src/lib/github.ts:160 js/identity-replacement Drop the no-op .replace(/-/g, "-").
2 packages/annotator/src/annotator.ts:64,65,68 (×3) js/incomplete-sanitization (CWE-116) Extract yamlEscape() that escapes backslashes before double quotes. Previously, a caseName like X \ Y would emit a raw backslash that YAML readers interpret as an escape-sequence start. Test added.
3 packages/fetcher/src/fetcher.ts:128 js/polynomial-redos Anchor regex to literal /download/releasepoints/us/pl/.../xml_usc…zip and replace [^"]* segments with [^"/]+. Test added (40k-char malformed input returns in <100ms).
4 scripts/bulk-import.ts:53–54, 62, 66 js/insecure-temporary-file, js/file-system-race Replace predictable /tmp/usc-title-N paths with mkdtemp(tmpdir() + 'usc-title-'). Drop statSyncreadFile pair (TOCTOU); derive size from header.length.
5 scripts/lib/import-helpers.ts:157–158, 178 js/insecure-temporary-file Same temp-dir hardening for the historical-import code path.

Findings NOT addressed here

  • js/http-to-file-access (×3) — the pipeline's purpose is fetch-remote-XML → write-file. Will dismiss in CodeQL UI as "won't fix - by design" with this PR linked as context.
  • js/insecure-temporary-file in packages/pipeline/src/orchestrate.ts:208,243 — these are product output writes (statute Markdown + annotation YAML) to a configured outputDir, not temp files. CodeQL's pattern matcher misclassifies them; path traversal is already guarded by isSafePath. Will dismiss "won't fix - false positive".
  • The 12 regex-based HTML sanitization findings in github.ts (sanitizeContent/sanitizeExcerpt) — deferred to a follow-up PR (PR F) that swaps in sanitize-html library; bundling here would require both a runtime-dep addition and a golden-output regression suite that's worth its own review.

Test plan

  • pnpm install, pnpm build, pnpm test (120 fetcher + ... = 269 tests; +2 new), pnpm typecheck (clean), pnpm lint (clean) all green locally
  • CI passes on this PR
  • Post-merge CodeQL re-scan: alerts list should drop by 6+ (1 + 3 + 1 + 1 + 1, possibly more due to rule-level deduplication)

🤖 Generated with Claude Code

Six targeted fixes for findings from the just-enabled CodeQL scan
(#166). Each fix is independently verifiable; tests added for the
two with subtle correctness implications.

1. apps/web/src/lib/github.ts — drop `.replace(/-/g, "-")` from
   `formatTagName`. Identity replacement; dead code from a prior
   refactor.

2. packages/annotator/src/annotator.ts — `yamlEscape()` now escapes
   backslashes before double quotes. Previously, a `caseName` like
   `X \ Y` would emit `"X \ Y"` (raw backslash) which YAML readers
   interpret as an escape sequence start. CWE-116 (incomplete
   sanitization). Test added.

3. packages/fetcher/src/fetcher.ts — `parseReleasePoints` regex
   anchored to the literal path structure (`/download/releasepoints/
   us/pl/.../xml_usc...zip`) and constrains each segment to
   non-slash/non-quote chars. Removes the `[^"]*` prefix that gave
   CodeQL js/polynomial-redos backtracking exposure. ReDoS-resistance
   test added (40k-char malformed input must return in <100ms).

4. scripts/bulk-import.ts — replace predictable `/tmp/usc-title-N`
   paths with `mkdtemp(tmpdir() + 'usc-title-')` (random per-call dir).
   Also drops the `statSync(tmpZip)` → `readFile(tmpZip)` pair (TOCTOU
   race) since the buffer's `.length` is sufficient. Single finally
   block now handles cleanup.

5. scripts/lib/import-helpers.ts — same temp-dir hardening as #4 for
   the historical import path (`downloadAndExtractXml`).

Verified locally: pnpm build, test (118+74+... = 269 passing — two new
tests), typecheck (0 errors), lint all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@williamzujkowski williamzujkowski requested a review from a team as a code owner May 12, 2026 12:22
@williamzujkowski williamzujkowski merged commit 5395114 into main May 12, 2026
3 checks passed
@williamzujkowski williamzujkowski deleted the fix/codeql-real-bugs branch May 12, 2026 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant