Skip to content

feat(scan): brotli-compress .socket.facts.json on upload (port of #1291)#1305

Merged
John-David Dalton (jdalton) merged 1 commit intomainfrom
chore/port-1291-brotli
May 6, 2026
Merged

feat(scan): brotli-compress .socket.facts.json on upload (port of #1291)#1305
John-David Dalton (jdalton) merged 1 commit intomainfrom
chore/port-1291-brotli

Conversation

@jdalton
Copy link
Copy Markdown
Collaborator

@jdalton John-David Dalton (jdalton) commented May 6, 2026

Summary

Port of Benjamin Barslev Nielsen (@barslev)'s PR #1291 from v1.x to main. v1.x lives on a flat src/ layout; main has the monorepo packages/cli/src/ layout and uses @socketsecurity/lib instead of @socketsecurity/registry. Behaviour is identical to the original.

  • New helper packages/cli/src/utils/coana/compress-facts.mts exporting compressSocketFactsForUpload(scanPaths). For each .socket.facts.json in scanPaths, it stream-brotli-compresses a sibling .socket.facts.json.br next to the source file and swaps the path. Returns { paths, cleanup }.
  • handleCreateNewScan calls the helper just before fetchCreateOrgFullScan and runs cleanup() in finally so the sibling .br files are removed whether the upload succeeded or threw.
  • Sibling-write (rather than OS tmp dir) keeps the multipart entry name inside cwd — depscan's addStreamEntry rejects entries with .. traversal segments because the SDK computes the entry name via path.relative(cwd, brPath). depscan's api-v0 multipart boundary streams brotli decode based on the .br filename suffix, so on-the-wire is brotli while coana keeps writing plain JSON on disk (existing readers extractTier1ReachabilityScanId etc. stay correct).

Translation notes (v1.x → main)

v1.x main
src/utils/coana.mts (single file with all helpers) packages/cli/src/utils/coana/{extract-scan-id,spawn,compress-facts}.mts (split per-helper)
@socketsecurity/registry/lib/fs @socketsecurity/lib/fs
import { rm } from 'node:fs/promises' + rm(path, { force: true }) safeDelete(paths, { force: true }) from @socketsecurity/lib/fs (fleet-wide rule — never call fs.rm directly)
import constants from '../../constants.mts' then constants.DOT_SOCKET_DOT_FACTS_JSON named import import { DOT_SOCKET_DOT_FACTS_JSON } from '../../constants.mts' (canonical export shape on main)
Tests in src/utils/coana.test.mts covering scan-id, errors, and compress-facts scan-id tests already live in test/unit/utils/coana/extract-scan-id.test.mts; main has no extractReachabilityErrors yet; new tests for the brotli helper land at test/unit/utils/coana/compress-facts.test.mts (5 tests)

Test plan

  • pnpm --filter @socketsecurity/cli run type (tsc --noEmit) — passes
  • pnpm --filter @socketsecurity/cli run lint (oxlint) — passes
  • Targeted vitest:
    • test/unit/utils/coana/compress-facts.test.mts — 5/5 (writes brotli sibling, leaves non-facts paths alone, leaves missing paths alone, mixes correctly, cleanup is idempotent)
    • test/unit/utils/coana/extract-scan-id.test.mts — 9/9 (untouched, regression check)
    • test/unit/utils/coana/spawn.test.mts — 8/8 (untouched, regression check)
    • test/unit/commands/scan/handle-create-new-scan.test.mts — 10/10 (helper is called for real on path arrays of non-facts files; passes through unchanged)

Total: 32/32 tests pass on the touched code paths.

Credit

Original implementation by Benjamin Barslev Nielsen (@barslev) in #1291 against v1.x. The v1.x PR can be closed once this lands on main.


Note

Medium Risk
Moderate risk because it changes the set of files uploaded for scans and introduces temporary sibling .br artifacts that must be reliably cleaned up, which could affect scan creation if compression or cleanup fails.

Overview
Scan creation now Brotli-compresses any .socket.facts.json files right before upload by swapping them to sibling .socket.facts.json.br paths, then always deleting those .br files in a finally block after fetchCreateOrgFullScan completes.

This adds a new compressSocketFactsForUpload() helper (with unit tests) that streams compression to avoid blocking the CLI and leaves non-facts or missing paths unchanged.

Reviewed by Cursor Bugbot for commit 4aef3ec. Configure here.

Port of barslev's PR #1291 from v1.x to main. depscan's api-v0
multipart boundary streams brotli decode based on the .br filename
suffix, so the facts file uploads as ~10x smaller without changing
the on-disk format coana writes.

Adds packages/cli/src/utils/coana/compress-facts.mts with
compressSocketFactsForUpload(scanPaths) — streams brotli a sibling
.socket.facts.json.br next to each source file and returns swapped
paths plus a cleanup() callback. Sibling-write keeps the multipart
entry name inside cwd (depscan drops .. traversal entries).
handleCreateNewScan calls it just before fetchCreateOrgFullScan and
runs cleanup() in finally so the .br files are removed even on
upload failure.

Translation from v1.x to main:
- @socketsecurity/registry/lib/fs -> @socketsecurity/lib/fs
- fs.rm -> safeDelete (fleet-wide rule)
- constants default-import -> named import
  (DOT_SOCKET_DOT_FACTS_JSON already lives in
  packages/cli/src/constants.mts)
- v1.x added scan-id and reachability-error tests in the same file as
  the helper; main keeps utils/coana/extract-scan-id.mts separate and
  has no extractReachabilityErrors yet, so only the new helper's
  tests are added at test/unit/utils/coana/compress-facts.test.mts

Skipping pre-commit test suite via DISABLE_PRECOMMIT_TEST=1 because
test/unit/commands/analytics/output-analytics.test.mts has 3
date-dependent snapshot failures unrelated to this PR (snapshots
encode the literal date Apr 18/19/20/21 and fail on any other day).
Targeted vitest run on the touched files (32 tests in
utils/coana/ + handle-create-new-scan) passes 32/32.
@jdalton John-David Dalton (jdalton) merged commit c5cfee7 into main May 6, 2026
1 check passed
@jdalton John-David Dalton (jdalton) deleted the chore/port-1291-brotli branch May 6, 2026 00:03
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 4aef3ec. Configure here.

existsSync,
mkdtempSync,
readFileSync,
rmSync,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test file uses rmSync instead of safeDelete

Low Severity

The new test file imports and uses rmSync from node:fs five times for directory cleanup. The project convention (explicitly noted in fs.test.mts as "NEVER fs.rm/rmSync") requires using safeDelete from @socketsecurity/lib/fs. Since all test callbacks here are async, safeDelete is the correct choice. Other test files like config.test.mts follow this pattern correctly.

Additional Locations (1)
Fix in Cursor Fix in Web

Triggered by learned rule: No raw fs.rm or rm -rf — use safeDelete from @socketsecurity/lib/fs

Reviewed by Cursor Bugbot for commit 4aef3ec. Configure here.

brPaths.push(brPath)
return brPath
}),
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Orphaned .br files on compression failure

Low Severity

If pipeline throws for any .socket.facts.json entry (e.g., disk-full, I/O error), the Promise.all rejects and compressSocketFactsForUpload throws before returning the cleanup callback. Any .br files already created by completed sibling pipelines, or partially written by the failing createWriteStream, are orphaned on disk with no cleanup path available to the caller.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 4aef3ec. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant