Skip to content

fix(section16): preserve null vs 0 for empty numeric leaves on Forms 3/4/5 (#116 follow-up)#117

Merged
sroussey merged 5 commits into
mainfrom
claude/wonderful-hypatia-iDxau
May 28, 2026
Merged

fix(section16): preserve null vs 0 for empty numeric leaves on Forms 3/4/5 (#116 follow-up)#117
sroussey merged 5 commits into
mainfrom
claude/wonderful-hypatia-iDxau

Conversation

@sroussey
Copy link
Copy Markdown
Contributor

The bug

EDGAR ownership filings (Forms 3, 4, 5, and their /A amendments) routinely emit empty XML elements for numeric fields the filer chose not to populate — e.g. <transactionShares><value/></transactionShares> or <transactionPricePerShare/>. After XML parsing these arrive as { value: "" }.

OwnershipDocument.schema.ts typed the inner value as Type.Number(). TypeBox's Value.Convert (run before our storage helpers) coerces "" to 0, so the empty element becomes a fabricated zero indistinguishable from a real reported zero by the time it reaches the storage layer's num() helper. That zero then gets written to the section16_transactions / section16_holdings numeric columns, silently corrupting:

  • transaction share counts and prices,
  • post-transaction shares/value owned,
  • derivative conversion/exercise prices,
  • underlying security shares/value.

PR #116 fixed exactly this bug for Form 144 by typing numeric leaves as Type.String() and letting storage coerce ""null. The same fix was not backported to the shared Section 16 schema. This PR does that backport.

The fix

  1. OwnershipDocument.schema.ts — change the inner value field of the shared VALUE_NUMBER wrapper from Type.Optional(Type.Number()) to Type.Optional(Type.String()), with a comment mirroring Form_144.schema.ts explaining why. This single change covers all 8 affected leaves:

    • transactionAmounts.transactionShares
    • transactionAmounts.transactionPricePerShare
    • postTransactionAmounts.sharesOwnedFollowingTransaction
    • postTransactionAmounts.valueOwnedFollowingTransaction
    • underlyingSecurity.underlyingSecurityShares
    • underlyingSecurity.underlyingSecurityValue
    • derivativeTransaction.conversionOrExercisePrice
    • derivativeHolding.conversionOrExercisePrice
  2. OwnershipDocument.storage.ts — narrow num()'s parameter type to { value?: string } | string | undefined and drop the now-dead typeof v === "number" branch. Behavior is identical: string-numeric still parses; "" still maps to null.

  3. extractor_version bumped from "1.0.0" to "1.1.0".

  4. Five regression tests in OwnershipDocument.storage.test.ts, mirroring the Form 144 pattern: load a real fixture, mutate one numeric leaf's .value to "", run storage, assert the corresponding DB column is null (not 0).

Migration strategy

Re-extraction via the extractor_version bump. The dispatch/extractor-runs machinery already triggers re-extraction whenever the recorded extractor version is behind the code's declared version. processOwnershipForm calls section16Repo.clearTransactions(accession_number) / clearHoldings(accession_number) before writing, so re-running is idempotent and naturally heals the fabricated-zero rows in place.

This covers Forms 3, 4, 5, 3/A, 4/A, 5/A — every form that flows through processOwnershipForm.

Why not a SQL migration?

A UPDATE section16_transactions SET shares = NULL WHERE shares = 0 (and similar for the other 7 columns) is unsafe and was rejected: legitimate zero values exist in the wild (e.g. a $0 acquisition price for a gift transaction, a position closed to exactly zero shares following a transaction). After this corruption ran in production, there is no way at the SQL layer to distinguish a fabricated-from-empty 0 from a genuine 0 — only re-extraction from the original XML can.

Downstream consumers

No === 0 sentinel consumers were found in this repo for any of the 8 affected columns. Out-of-repo downstream consumers that interpret these columns should switch from === 0 / IS 0 checks to IS NULL for the "not reported" case, since post-fix the column is properly NULL rather than a fabricated 0.

What this PR explicitly does not touch

  • Section16Schema.ts — DB columns are already nullable; no schema change needed.
  • Section16Repo.ts — repo just stores what storage hands it.
  • Form_144.* — already correct (this PR is the backport).
  • Other extractors — unaffected.

https://claude.ai/code/session_013keqifgtvU1cKdyvoNf1ua


Generated by Claude Code

sroussey added 5 commits May 28, 2026 01:18
…n OwnershipDocument.storage.ts

- Consolidated and reordered import statements for better readability.
- Adjusted the extractor version to align with recent changes.
- Improved formatting of function parameters for clarity.
- Ensured consistent handling of transaction and holding saving logic.
@sroussey sroussey merged commit 5cda068 into main May 28, 2026
@sroussey sroussey deleted the claude/wonderful-hypatia-iDxau branch May 28, 2026 16:22
@sroussey sroussey restored the claude/wonderful-hypatia-iDxau branch May 28, 2026 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant