Skip to content

Sentinel-encoded nulls: drop per-element bitmap#206

Merged
singaraiona merged 38 commits into
masterfrom
sentinel-migration-finish
May 18, 2026
Merged

Sentinel-encoded nulls: drop per-element bitmap#206
singaraiona merged 38 commits into
masterfrom
sentinel-migration-finish

Conversation

@singaraiona
Copy link
Copy Markdown
Collaborator

Summary

Make the type-correct sentinel in the payload the sole source of truth for null state across producers, consumers, on-disk format, and on-wire format. Decommission the per-element bitmap entirely: no inline bits, no external bitmap allocation, no ext_nullmap field, no RAY_ATTR_NULLMAP_EXT attribute, no on-disk or on-wire bitmap region. RAY_ATTR_HAS_NULLS stays as the vec-level fast-path bit.

What changed

  • Null encoding per type: F64 NaN, F32 NaN, I64/TIMESTAMP INT64_MIN, I32/DATE/TIME INT32_MIN, I16 INT16_MIN, STR len=0, GUID 16 zero bytes. BOOL / U8 / SYM are non-nullable — ray_vec_set_null_checked returns RAY_ERR_TYPE for these.
  • Single null reader: ray_vec_is_null dispatches to the type's sentinel check. No bitmap fallback.
  • CSV reader, IPC serde, col file format, splayed writer: stopped allocating, writing, or restoring the bitmap. Wire and on-disk formats shrink to type / attrs / len / data. ray_serde_size no longer adds bitmap bytes.
  • linkop / idxop: sentinel-encoded nulls don't consume the union arm, so ray_link_attach writes link_target unconditionally and ray_index_attach doesn't need to preserve a bitmap.
  • Fused expression evaluator (expr_exec_*): added missing handlers exposed once the lockdown removed the HAS_NULLS shortcut that previously rejected fused compilation for narrow integer columns (I64→I64 CAST, BOOL→I64 CAST, I64/I64 AND/OR, BOOL dt=I64 AND/OR).
  • Removed surface: ray_vec_nullmap_bytes, vec_inline_nullmap, read_nullmap_bit, ray_index_release_saved's bitmap branch, col_restore_ext_nullmap, promote_inline_to_ext, null_bitmap_size, ser_null_bitmap, de_null_bitmap, nullmap_bits_mut, RAYFORCE_NULL_AUDIT instrumentation + Makefile target, RAY_ATTR_NULLMAP_EXT, ext_nullmap / str_ext_null union field names.
  • Rewritten in place: ~14 tests across test_vec.c, test_sort.c, test_store.c, test_index.c, test_heap.c, test_link.c, test_exec.c, and several rfl tests so they assert the post-strip behavior instead of bitmap mechanics. Three rfl files renamed: bool_u8_lockdown.rfl → bool_u8_non_nullable.rfl, f64_dual_encoding.rfl → f64_nan_encoding.rfl, integer_dual_encoding.rfl → integer_sentinel_encoding.rfl.
  • Comment / doc sweep: stripped Phase-N, dual-encoding, "legacy bitmap", "sentinel-migration", and related historical narrative from every comment and doc. Deleted the now-obsolete plan / spec under docs/superpowers/.

Diff stat

73 files changed, +1420 / -2082

Test plan

  • make clean && make -j8 — clean with -Werror, ASan + UBSan
  • make test=== 2449 of 2450 passed (1 skipped, 0 failed) ===
  • ./rayforce.test -f rfl/null — 12 / 12 null-related rfl tests pass (arith, agg, cast, concat, distinct, sort, update, format, grouped_agg, bool_u8_non_nullable, f64_nan_encoding, integer_sentinel_encoding, sentinel_only_baseline)

hetoku added 30 commits May 18, 2026 11:01
Captures the end-state contract (sentinel as sole source of truth,
RAY_ATTR_HAS_NULLS retained as check-free fast-path gate, nullmap[16]
arm decommissioned), the 6-stage work plan, and the test/perf strategy.

Supersedes the in-code multi-phase plan at include/rayforce.h:309-346.
All work lands on this branch; one completion PR against master at the
end per the no-partial-state-to-master rule.
The original Stage A plan flipped ray_vec_is_null to sentinel-based on
the assumption that Phase 3a-13 had closed all producer-side dual-
encoding gaps.  First execution exposed ~40 test failures + 1 ASAN
SEGV across 9 test files and ~17 operator source files (reverted at
f8a2e9c).  The doc's "all gaps closed" claim was overstated.

Adds a debug-only consistency check guarded by -DRAYFORCE_NULL_AUDIT:
ray_vec_is_null cross-checks the bitmap answer against sentinel_is_null
and logs each divergent caller (deduped by return address, max 128
sites) to stderr with a backtrace.  Behavior unchanged in non-audit
builds; bitmap remains authoritative until Stage 2 of the revised plan.

Adds `make audit` target.  Baseline audit reports 142 unique divergent
call sites concentrated in window/sort/join/builtins/idxop/string —
captured in the design doc as the Stage 1 target list.

Updates docs/superpowers/specs/2026-05-18-sentinel-migration-finish-design.md
with the revised stage order: Stage 0 (instrumentation, this commit),
Stage 1 (producer-gap closure via make audit), Stage 2 (flip source of
truth), then the original Stages 3-6 (drop bitmap writes, remove
storage, cleanup, verify+PR).

No master-bound PR — work continues on the long-running branch.
The post-cast sentinel-fill loop walked the destination bitmap to drive
null-slot identification (ray_vec_is_null(vec, j)).  This works today
because the dest's bitmap is set by ray_vec_copy_nulls before the loop
runs, but it breaks under sentinel-as-source-of-truth: the cast loop
overwrote the dest payload before the fill loop runs, so a sentinel-
based ray_vec_is_null on the dest returns false at every slot.

Walk the SOURCE (val) instead.  Source's null state is the source of
truth for the cast, and works uniformly under either bitmap- or
sentinel-authoritative readers.  Source has the sentinel because all
upstream producers (parse.c, CSV ingest, ray_typed_null, etc.) write
both the bitmap and the sentinel per Phase 2/3a.

Also splits the LIST branch out cleanly: ray_vec_set_null already
writes both encodings per Phase 3a-4, so the LIST case returns before
the post-fill block runs.  No behavior change for LIST input.

Stage 1 producer fix #1 of N.  make audit divergence count:
142 unique callers -> 138 (the 4 cast_vec_copy_nulls sites at
builtins.c:769/775/781/787 are gone).  Full suite (non-audit):
2450/2451 passing.
Window/group parallel kernels call win_set_null = par_set_null on
result slots after writing 0 / 0.0 to the payload.  par_set_null only
atomic-OR'd the bitmap bit, leaving the payload at the cast-result
(non-sentinel) value.  Bitmap-authoritative readers see the null;
sentinel-authoritative readers see a real zero — the dual-encoding
gap the migration needs to close before flipping ray_vec_is_null.

par_set_null now first writes the type-correct NULL_* sentinel into
payload[idx], then performs the existing atomic bitmap OR.  Payload
write needs no synchronisation because parallel callers always use
distinct idx; the atomic OR remains for the bitmap (multiple slots
share a byte).  STR/SYM/BOOL/U8 stay as-is — their null encoding is
out-of-band or the type is non-nullable.

make audit divergence count: 138 unique callers -> 120.  The 18 win.c
sites (lag/lead/first_value/last_value/nth_value/running_*) close in
one fix because they all funnel through win_set_null.  Full suite
(non-audit): 2450/2451 passing.
The central nullity-marking API stamped only the bitmap bit, leaving
the payload at its prior value (typically 0 or the user-passed cell
value).  Every operator that calls ray_vec_set_null or its checked
form to mark a result-slot null (cast paths, joins, ASOF fill, sort
sentinel reorder, group-by per-(group, agg) finalization, dict
upsert, expr/strop result-fill, etc.) was producing bitmap-only nulls.

Now ray_vec_set_null_checked writes the type-correct NULL_* sentinel
into payload[idx] before performing the existing bitmap maintenance.
is_null=false still only clears the bit (caller owns the real-value
payload write).  STR continues to defer payload semantics to the
ext-alloc path because its inline-pointer-pair arm aliases
str_pool / str_ext_null.

Also fixes test_exec_asof_left_join which asserted (bid_data[0] == 0.0)
for a no-match fill slot — encoding the legacy bitmap-only behavior.
Updated to check ray_vec_is_null(bid_col, 0) instead, which works
correctly under both bitmap-as-truth and sentinel-as-truth.

make audit divergence count: 120 unique callers -> 27.  Full suite
(non-audit): 2450/2451 passing.
For STR vecs, marking a slot null now zeroes the ray_str_t element
(len=0, no pool offset, no inline data).  Sentinel-based readers
detect STR null via len == 0 per the empty-string-is-null convention.

Prior behavior left the element's len / pool_off / data intact when
marking null; only the bitmap bit changed.  Tests using the str ops
surface (concat, split, format) had bitmap-only nulls on STR
results, which my prior commit's audit caught.

Dead pool bytes from severed pool_off references are not reclaimed
here — same behavior as overwriting a long string with a short one.

make audit divergence count: 27 unique callers -> 16.
Full suite (non-audit): 2450/2451 passing.
BOOL/U8 are locked as non-nullable per Phase 1.  Several legacy tests
predate the lockdown (test_vec_null_external, test_sort_u8_nulls_*,
test_sort_bool_nulls_first) and exercise ray_vec_set_null on these
types.  The audit reports those as divergent (bitmap=1, sentinel=0)
because BOOL/U8 have no NULL_* sentinel by design.

These are not real producer gaps — they're test scenarios for an API
behavior that goes away with the migration (the bitmap itself
disappears at Stage 4).  Suppress audit on BOOL/U8 vecs so the
remaining divergence count focuses on real nullable-type gaps.

The cleanup of these tests + locking ray_vec_set_null_checked to
reject BOOL/U8 (matching the existing SYM rejection) is tracked as
a Stage 1 exit-gate task.

make audit divergence count: 16 unique callers -> 7.
GUID is 16 bytes with no committed null sentinel convention.  Tests
that exercise ray_vec_set_null on RAY_GUID vecs mark the bitmap but
sentinel_is_null returns false (no GUID case in the switch).  Suppress
to focus on real I64/F64/etc producer gaps.  Picking a GUID null
convention (likely all-zeros) is a Stage 1 exit-gate task.

make audit divergence count: 7 unique callers -> 4.
The bulk-OR fast path in propagate_nulls merged source nullmap bits
into the destination but never touched the destination payload —
leaving the cast/computed value (e.g. (double)NULL_I64 ≈ -9.22e18)
in slots the bitmap now claims are null.  Binary arithmetic, unary
ops, and the per-cast null-fill all funnel through this helper, so
the gap surfaced in test_expr_binary_null_propagation.

Adds stamp_sentinels_from_bitmap(): given the freshly OR'd dst
bitmap, walk and stamp the type-correct NULL_* into payload at each
null slot.  Called once after the bulk OR completes, so the per-byte
bitmap scan stays cache-warm.  The slow per-element path already
went through ray_vec_set_null which dual-writes, so no change there.

make audit divergence count: 4 unique callers -> 1.  The lone
remaining site is test_vec_null_inline:249 — a legacy test that
clears the bitmap bit without writing a real value, so the prior
NULL_I64 sentinel persists in the payload (correct under sentinel-
as-truth; bug in the test's bitmap-only mental model).  Stage 1 exit
task: update the test or remove it.

Full suite (non-audit): 2450/2451 passing.
The test cleared the bitmap bit at slot 3 without restoring the
payload value.  Under bitmap-as-truth this returned not-null (correct);
under sentinel-as-truth the prior NULL_I64 sentinel from set-null
persists and the sentinel reader still sees null.

Convention going into Stage 2: clear-null on a vec requires the
caller to first restore a real payload value.  ray_vec_set_null
with is_null=false touches only the bitmap (the prior value is
unknown to it).  Document this in the test.

make audit divergence count: 1 unique caller -> 0.  Stage 1 exit
gate achieved.  Full suite (non-audit): 2450/2451 passing.
ray_vec_is_null now reads the payload sentinel as source of truth for
types with a defined NULL_* sentinel (F64, I16, I32, I64, DATE, TIME,
TIMESTAMP, STR).  Types without a sentinel (BOOL, U8, GUID, F32)
retain the bitmap path — they're either non-nullable per Phase 1 or
have a deferred sentinel design.

Approach: type-switch in ray_vec_is_null with bitmap fallback for the
sentinel-less types.  Audit instrumentation cross-checks the sentinel
answer against the bitmap for the flipped types only (BOOL/U8/GUID
suppressed as before).  Producer surface for the flipped types was
audited clean in S1.1..S1.7 (zero divergent callers), so no behavioral
regression for the sentinel types.

The atom-level RAY_ATOM_IS_NULL macro stays bitmap-based for now — a
naive flip exposed two cascading design questions (empty-string-IS-null
semantic for dict find / nil? "", and a GUID-typed-null obj=NULL deref
in cmp.c's GUID compare branch) that warrant their own session.

Full suite: 2450/2451 passing.  Stage 2b (atom-level flip) pending.
User decisions:
  1. Empty string IS a null STR atom; tests that assumed distinction
     between "" and null are updated to the new semantic.
  2. NULL_GUID = 16 all-zero bytes (canonical convention).

Core changes
------------
- RAY_ATOM_IS_NULL becomes a type-dispatched inline function.  Atom
  null for F64 = NaN, I*/temporal = INT_MIN sentinel, SYM = id 0,
  STR = empty (slen==0 AND obj==NULL — long strings overlay obj on
  the union, so a non-NULL obj pointer can have a low byte that
  reads as slen=0; the AND obj==NULL guard prevents false positives),
  GUID = 16 zero bytes in obj's U8 payload.  BOOL/U8/F32 still
  bitmap-based (no sentinel committed for those types).
- ray_typed_null(-RAY_GUID) now allocates a fresh ray_guid with 16
  zero bytes (instead of leaving obj=NULL), so consumers can ray_data
  the obj unconditionally and cmp.c's GUID compare branch doesn't
  trip on a typed-null GUID's NULL obj.
- sentinel_is_null + ray_vec_set_null_checked gain RAY_GUID arms:
  read/write 16 zero bytes at payload + idx*16 for the null sentinel.
- ray_vec_is_null promotes GUID from the bitmap-fallback type list
  into the sentinel-supporting list.

Test updates (per user direction 1)
-----------------------------------
- test/test_dict.c: empty-string and all-zero-GUID lookups now find
  the null slot (index 1) rather than returning -1.  Documented
  conflation per the design doc.
- test/test_lang.c (insert_guid): typed-null GUID atom's obj is
  non-NULL post-migration; check the 16-byte payload is all zeros
  instead of asserting obj==NULL.
- test/rfl/integration/null.rfl: (nil? "") --> true.
- test/rfl/arith/sqrt.rfl: (nil? (sqrt -1.0)) --> true; (!= NaN NaN)
  --> false (cmp.c null-handling treats two nulls as equal).
- test/rfl/strop/split.rfl: split "" "," yields a one-element vector
  whose only element is null; assert via nil? rather than [0Nc]
  (parser rejects that literal form).
- test/rfl/agg/pearson_corr.rfl: undefined-result detection via nil?
  rather than self != self (same cmp.c null-handling implication).
- test/rfl/integration/fused_group_parity.rfl + test/rfl/type/as.rfl:
  INT_MIN values now round-trip as typed null (documented hazard in
  include/rayforce.h NULL_* block — sentinel collision with user-
  stored INT_MIN).
- test/rfl/system/read_csv.rfl: SYM-vec vs null-STR-atom filter
  behavior changed; documented tension for follow-up.

Full suite: 2450/2451 passing.  make audit: 0 divergences.
Mirror NULL_F64's NaN encoding for F32 (NULL_F32 = __builtin_nanf(""))
so F32 column data has the same IEEE-NaN-as-null semantics as F64.
Useful for embedding columns (RAY_F32 ML vector data) where NaN
commonly denotes missing values.

Touched:
- include/rayforce.h: NULL_F32 constant; RAY_ATOM_IS_NULL F32 case
  (reads f64 union slot and downcasts since F32 atoms reuse f64
  payload — see atom.c:82).
- src/vec/vec.c: sentinel_is_null + ray_vec_set_null_checked +
  ray_vec_is_null sentinel-list now include RAY_F32.
- src/vec/atom.c: ray_typed_null(-RAY_F32) writes NaN into the f64
  union slot.
- src/ops/internal.h: par_set_null writes NULL_F32 into payload.

make audit: 0 divergences.  Full suite: 2450/2451.
Document in ray_vec_set_null_checked why the bitmap write stays
alongside the sentinel write: ray_vec_nullmap_bytes consumers
(propagate_nulls fast path, group.c radix HT, idxop save/restore,
morsel iter, serde encode) still read the bitmap directly and must
be made sentinel-aware before the bitmap write can be stripped.
That conversion is tracked as Stage 3' (formerly Stage B in the
design doc); attempted strip without it produced 35 test failures.

Also document that BOOL/U8 lockdown (reject set-null like SYM) is
deferred to a later session — legacy tests still exercise the bitmap
API on these types and need cleanup or deletion before the API
returns RAY_ERR_TYPE.

No functional change; reformatting of the sentinel-write switch and
comment text only.  Suite stays 2450/2451, audit clean.
ray_serde IPC encoder used ray_vec_nullmap_bytes to write the wire
null bitmap.  Convert to ray_vec_is_null scan so the encoder stays
correct once the bitmap arm is reclaimed.  Per-element loop runs at
encode time only (cold path); decoder unchanged (still writes the
bitmap on the receive side, where the bitmap is still maintained).

Stage 3' converter #1 of 14.  Full suite: 2450/2451.
Three call sites in expr.c used ray_vec_nullmap_bytes via the local
nullmap_bits helper for byte-aligned bitmap scans:

- propagate_nulls (binary op null OR): bulk-OR src bitmap into dst
  + post-pass sentinel stamp.  Replaced with per-element
  ray_vec_is_null + ray_vec_set_null loop.  ray_vec_set_null dual-
  writes (sentinel + bitmap) so dst's null state stays consistent
  under both dual-encoding and bitmap-stripped end states.
- fix_null_comparisons one-sided-null fast path: byte-level bitmap
  walk with chunked skip-empty-byte.  Replaced with per-element
  sentinel scan; lost the 8-element chunk speedup but works after
  bitmap is reclaimed.  Existing slow path (two-sided null cases)
  already used ray_vec_is_null and is unchanged.
- set_all_null: kept the dst-bitmap mass-write via nullmap_bits_mut
  for now (will go in the final cleanup) but added missing F32 / STR
  / GUID sentinel-fill arms so scalar-null broadcast leaves dst in
  the dual-encoded state for every nullable type.

Removed the now-unused nullmap_bits and stamp_sentinels_from_bitmap
helpers.  nullmap_bits_mut remains until the bitmap arm is reclaimed.

Stage 3' converter #2.  Full suite: 2450/2451.
Convert the scalar reduction path (reduce_range + par_reduce_ctx)
and the count-distinct-per-group parallel path (cdpg_ctx + worker
fns) and the O(1) FIRST/LAST short-circuit to use sentinel checks
instead of raw nullmap byte reads.

Macros refactored:
- REDUCE_LOOP_I / DISPATCH_I gain a NULL_SENT parameter (the type-
  correct NULL_* literal).  Per-element bitmap byte read replaced
  with `raw == NULL_SENT`.
- REDUCE_LOOP_F null check becomes `v != v` (NaN self-equality).
- BOOL/U8 dispatch dropped from DISPATCH_I — those types have no
  sentinel.  Inlined a small loop in reduce_range that calls
  ray_vec_is_null (falls back to legacy bitmap path for these types
  until Phase 1 lockdown lands).

Contexts shrunk:
- par_reduce_ctx_t: drops null_bm field.
- cdpg_ctx_t: drops null_bm field.  cdpg_hist_fn / cdpg_scat_fn use
  a new cdpg_is_null inline helper that mirrors sentinel_is_null
  but specialised for cdpg's pre-resolved (base, in_type, esz).
  SYM null detection becomes `sym_id == 0` directly.

FIRST/LAST short-circuit at the top of the scalar reduce path now
calls ray_vec_is_null per row instead of indexing into null_bm
bytes.  Same dispatch shape, just one helper call instead of two
arithmetic ops.

Stage 3' converters #3#7 of 14.  Full suite: 2450/2451.
make audit: 0 divergences.
Convert the remaining ray_vec_nullmap_bytes consumers to type-correct
sentinel reads.  All non-self call sites are now gone — only the
declaration in vec.h and the (now-dead) implementation in vec.c
remain, ready for Stage 5 cleanup.

group.c contexts converted (null_bm field removed, init drops the
ray_vec_nullmap_bytes call, kernel reads sentinel via a type-aware
helper specialised for that path's input layout):

- count_distinct_per_group_serial (cdpg_is_null on src base/type)
- med_par_ctx_t + worker (new med_is_null helper)
- topk_par_ctx_t + worker (med_is_null reused)
- grpt_phase1_ctx_t + worker (new grpt_is_null, ray_sym_elem_size
  inline for SYM width)
- grpc_phase1_ctx_t + worker (new grpc_is_null; covers 2-key
  pearson_corr with x/y val cols)
- grpmm_phase1_ctx_t + worker (1-key pearson_corr variant; reuses
  grpc_is_null)

query.c sites:

- cdpg_buf_par_ctx_t: per-typed-loop inline sentinel check
  (NULL_I64 / NULL_I32 / NULL_I16 / NaN); BOOL/U8 (esz==1) becomes
  unconditional non-null per Phase 1 lockdown.
- ray_xbar_fn's vec-col vs vec-col path: replaces the byte-aligned
  bulk-bitmap walk with per-element ray_vec_is_null.

The byte-level fast paths trade SIMD speedups for sentinel-based
correctness once the bitmap arm is reclaimed.  Re-vectorising on top
of payload-sentinel scans is a future perf-engineering task.

Stage 3' complete (14/14 converters).  Full suite: 2450/2451.
make audit: 0 divergences.  Stage 3 (strip bitmap writes) is now
safe to land.
Attempt to strip the bitmap write side of ray_vec_set_null_checked
after Stage 3' produced 8 test failures:

  index/attach_drop_with_ext_nullmap   morsel/has_index_inline_nulls
  index/null_readers_through_helper    morsel/has_index_ext_nulls
  index/nullmap_helper_slice           store/col_ext_nullmap_roundtrip
  index/persistence_roundtrip
  index/retain_saved_ext_nullmap

These are not covered by the Stage 3' ray_vec_nullmap_bytes sweep —
they touch the bitmap through other paths:

- morsel.c iteration publishes m->null_bits to morsel consumers,
  filled from the inline / ext nullmap.  Consumers walk those bits
  directly; need sentinel-aware iteration to keep working.
- idxop.c attach_finalize displaces the inline nullmap into
  ix->saved_nullmap and clears NULLMAP_EXT on the parent; detach
  restores.  With sentinels there is nothing to save / restore —
  the snapshot becomes a no-op.
- store/col.c persists the bitmap as an on-disk segment.  Greenfield
  rule says hard format break — drop the segment.

Revert the strip; bitmap stays dual-encoded for now.  Stage 3'' will
convert these three consumers; only then is the strip safe.
ray_morsel_next previously published m->null_bits as a pointer into
the source vec's bitmap (inline / ext / HAS_INDEX-displaced).
Production code never reads it — null_bits is consumed only by
test/test_morsel.c — but the bitmap-pointer publication blocks the
Stage 3 strip: post-strip the source bitmap is stale zeros and
null_bits would silently misreport "no nulls."

Replace the bitmap-pointer publication with on-demand synthesis:
ray_morsel_t gains a RAY_MORSEL_ELEMS/8 = 128-byte null_bits_buf
scratch field.  Per morsel, ray_morsel_next scans the current
chunk via ray_vec_is_null (sentinel-based for the sentinel-supported
types, bitmap fallback for BOOL/U8) and packs the bits into
null_bits_buf.  null_bits points into that buffer so existing
consumers — including the test fixtures — see the same byte/bit
layout as before.

Cost: one O(morsel_len) sentinel scan per chunk, paid only when
HAS_NULLS is set on the source.  morsel_len is bounded at 1024 so
this is well under a microsecond per morsel.

Stage 3'' converter #1 of 3.  Full suite: 2450/2451.  Removes the
morsel obstacle to the bitmap strip.
Updated the strip-hold comment in ray_vec_set_null_checked.  After
S3''.1 (morsel synthesis), re-attempting the strip drops the failure
count from 8 → 7 — morsel/has_index_inline_nulls now passes.  The
remaining 7 all assert bitmap-internal artifacts:

  index/attach_drop_with_ext_nullmap
  index/null_readers_through_helper
  index/nullmap_helper_slice
  index/persistence_roundtrip
  index/retain_saved_ext_nullmap
  morsel/has_index_ext_nulls
  store/col_ext_nullmap_roundtrip

These tests directly exercise the ext_nullmap pointer / NULLMAP_EXT
attribute flag / ray_vec_nullmap_bytes return value.  Post-strip
those artifacts don't exist for sentinel types.  Half are pure
implementation-detail tests of functionality being removed
(ray_vec_nullmap_bytes goes away in Stage 5); the other half could
plausibly be updated to use sentinel-based round-trip assertions.

No further code change; awaiting user direction on the test removal /
rewrite split.
ray_vec_set_null_checked now early-returns after writing the type-
correct NULL_* sentinel and setting HAS_NULLS for the sentinel types
(F64 / F32 / I16 / I32 / I64 / DATE / TIME / TIMESTAMP / STR / GUID).
The per-element bitmap is no longer maintained for these types —
all readers (ray_vec_is_null, RAY_ATOM_IS_NULL, morsel synthesis,
group/query/expr/serde kernels) went sentinel-only in S1.*–S3''.1.
BOOL / U8 retain the legacy bitmap path until Phase 1 lockdown lands.

Tests updated in place (per user direction option 2 — rewrite, no
deletion) for the 7 cases that asserted bitmap-internal artifacts
(NULLMAP_EXT attr, ext_nullmap pointer, raw nullmap[] snapshot,
ray_vec_nullmap_bytes return value).  Each rewrite drops the
bitmap-flag assertions and keeps the round-trip / null-detection
checks via ray_vec_is_null:

- test_index_attach_drop_with_ext_nullmap: still exercises
  attach + drop on a >128-element vec with nulls; asserts null
  state survives via ray_vec_is_null instead of comparing
  nullmap[] snapshots and NULLMAP_EXT flags.
- test_index_nullmap_helper_slice: drops the ray_vec_nullmap_bytes
  pointer / bit-offset assertions; keeps slice-relative
  ray_vec_is_null checks.
- test_index_null_readers_through_helper: replaces the bitmap-byte
  before/after assertions with ray_vec_is_null calls before and
  after attach (sentinel-based reads see through the union-arm
  index pointer overlay correctly).
- test_index_persistence_roundtrip: drops the pre-attach
  NULLMAP_EXT assertion (it's gone post-strip; the round-trip
  null-detection assertions are unchanged).
- test_index_retain_saved_ext_nullmap: drops the NULLMAP_EXT
  precondition; the shared-COW + drop + sentinel-read invariant
  on `b` still holds.
- test_morsel_has_index_ext_nulls: drops NULLMAP_EXT /
  saved_attrs assertions; keeps morsel iteration null_bits check
  via the synthesized buffer.
- test_col_ext_nullmap_roundtrip: drops NULLMAP_EXT / ext_nullmap
  pointer assertions on both ray_col_load and ray_col_mmap paths;
  keeps the ray_vec_is_null + data assertions.

Full suite: 2450/2451.  make audit reports 202 divergences post-
strip — expected and not noise: the audit was Stage 1 instrumentation
that cross-checked bitmap-vs-sentinel under dual encoding; now that
the bitmap is no longer written for sentinel types, divergence on
every null read is by design.  The audit instrumentation will be
removed in Stage 5.
Mirror the ray_vec_set_null_checked strip into the parallel-safe
helper.  par_set_null for sentinel-supporting types (F64 / F32 /
I16 / I32 / I64 / DATE / TIME / TIMESTAMP) now writes only the
NULL_* sentinel into the payload and atomically OR's HAS_NULLS
into vec->attrs — no bitmap bit set.  Multiple workers calling on
different idx see no contention on the payload (per-slot stores);
the attrs OR is atomic so the shared byte is safe.

BOOL/U8 fall through to the legacy bitmap path (atomic OR into the
inline / ext bit position, lazy ext alloc via ray_vec_set_null when
idx >= 128 with no NULLMAP_EXT yet).

par_prepare_nullmap and par_finalize_nulls stay as-is — they
operate only when the legacy bitmap path is active, which is now
the BOOL/U8 case only.  ext_nullmap allocation in csv.c and the
remaining BOOL/U8 paths is wasted memory for sentinel-typed columns
now but harmless; Stage 4 (storage reclamation) addresses that.

Full suite: 2450/2451.
Briefly attempted to extend the SYM rejection in
ray_vec_set_null_checked to BOOL/U8 (matching the Phase 1 design
intent that those types are non-nullable).  The lockdown produces
11 test failures across exec/expr_*_nullable, sort/{u8,bool}_*,
vec/null_external*, store/serde_vec_null_bitmaps, rfl/null/cast,
rfl/collection/distinct — all exercise BOOL/U8 nullable scenarios
that Phase 1 says shouldn't exist.

Per the migration test-handling convention (rewrite tests in place,
no deletion), each of these needs a sentinel-based rewrite or a
documented "this scenario can't exist post-Phase-1" assertion.
Reverted the lockdown to keep the session at a clean checkpoint;
queued for the BOOL/U8 lockdown task in TaskList.
Phase-1 plan moved BOOL/U8 to no-nullability semantics; this finishes
the lockdown now that the bitmap arm is gone in the union.

Three fused-evaluator gaps surfaced once the lockdown removed the
HAS_NULLS shortcut that previously rejected fused compilation for
narrow integer columns.  All three were latent bugs (the unreached
branches were no-ops, leaving destination scratch buffers
uninitialised); only post-lockdown can the code paths actually
exercise them:

  * expr_exec_unary OP_CAST I64→I64 — sole-buffer view is a no-op,
    but src and dst are separate scratch slots, so dst must receive
    the data via memcpy.  Same for F64→F64.
  * expr_exec_unary OP_CAST BOOL→I64 — BOOL scratch is 1 byte/elem,
    previously fell into the F64 fallback and read past the buffer.
  * expr_exec_binary OP_AND / OP_OR for the dt=BOOL t1=t2=I64
    arm (BOOL cols loaded as I64 abstract via expr_load_i64) and the
    plain dt=I64 t1=t2=I64 arm.

cast_vec_copy_nulls now early-returns for BOOL/U8 destinations: the
cast loop has already written the type's zero value at any source-null
position, and ray_vec_set_null on the dest would otherwise return
RAY_ERR_TYPE.

Eleven tests rewritten in place to reflect the lockdown:
  * test/test_vec.c — null_external pair switched to I16, then assert
    that set_null on a BOOL/U8 vec returns RAY_ERR_TYPE.
  * test/test_sort.c — three sort-with-nulls tests collapsed to
    lockdown assertions (BOOL/U8 inputs are no-null).
  * test/test_store.c — serde BOOL stanza now checks non-null
    round-trip behaviour.
  * test/rfl/null/cast.rfl — `(as 'B8 [1 0N 3])` collapses the null,
    so nil? sum is 0 not 1.
  * test/rfl/collection/distinct.rfl — `(nil? (at (as 'U8 ...) 1))`
    is false post-lockdown.
  * test/test_exec.c — expr_unary_cast_narrow_nullable expects 6 / 2
    counts; expr_binary_u8_nullable expects 48 / 105 / 15;
    expr_binary_bool_nullable expects 2 / 4.

2450 / 2451 passing (1 skipped, 0 failed) under DEBUG_CFLAGS
(ASan + UBSan).
hetoku added 8 commits May 18, 2026 15:46
  * vec/vec.c ray_vec_set_null_checked — every remaining vec type uses
    a sentinel (BOOL/U8/SYM rejected up top by S3.3); the inline-128
    and ext-promotion branches were unreachable after lockdown.  The
    function now writes the type-correct NULL_* into the payload, sets
    HAS_NULLS, and returns.

  * ops/expr.c set_all_null — drops the dual-encoding holdover that
    ensured ext_nullmap allocation and 0xFF-filled the bitmap.  The
    sentinel payload fill is the sole source of truth read by
    ray_vec_is_null and the morsel synthesis path.

  * ops/expr.c nullmap_bits_mut — deleted; its only caller was the
    set_all_null bitmap fill.

2450 / 2451 passing under ASan + UBSan (1 skipped, 0 failed).
The on-disk null encoding is now the type-correct sentinel in the
payload, matching the in-memory contract.

Save side (col_save_*):
  * No bitmap append after the data region.
  * Header rebase for HAS_INDEX no longer carries the saved
    NULLMAP_EXT bit, and the SLICE / NULLMAP_EXT bits are scrubbed
    unconditionally before writing.

Load side (col_validate_mapped / ray_col_load / col_mmap_impl):
  * Files whose header still has NULLMAP_EXT set are rejected as
    corrupt (greenfield — no backwards-load path).
  * col_restore_ext_nullmap, the has_ext_nullmap / bitmap_len fields
    in col_mapped_t, and the appended-bitmap size accounting in the
    mmap layout check are all gone.  Renamed bitmap_offset →
    tail_offset to reflect that it now marks file end.

2450 / 2451 passing under ASan + UBSan (1 skipped, 0 failed).
CSV row parsers were already writing the type-correct sentinel into
col_data[c] at every null cell; the parallel bitmap write was the
dual-encoding holdover.  With the sentinel as sole truth and
ray_vec_is_null reading payload, the bitmap is dead weight.

  * read_csv setup: drop the per-column inline/external bitmap
    allocation.  col_had_null[] stays as the HAS_NULLS bookkeeping.
  * Parser functions (csv_parse_fn / csv_parse_serial /
    csv_intern_strings / csv_fill_str_cols) lose their col_nullmaps
    parameter and the `bits |= 1 << bit` writes from every per-type
    case.
  * ctx structs (csv_finalize_ctx_t, csv_par_ctx_t) lose their
    col_nullmaps field.
  * Post-parse strip: a single HAS_NULLS attr toggle replaces the
    bitmap-release / inline-zero branching.  SYM narrowing path
    copies HAS_NULLS without touching any nullmap bytes.
  * Splayed writer: removes the entire side-channel that wrote a
    per-column null bitmap file (null_fp / null_tmp_path / null_acc
    / csv_splayed_writer_null_bit / csv_splayed_writer_zero_nulls)
    and the tail-concat in csv_splayed_writer_close.  Sentinel
    payload alone carries null state on disk.

After: `grep -c "ext_nullmap\|col_nullmaps\|NULLMAP_EXT" src/io/csv.c` → 0.

2450 / 2451 passing under ASan + UBSan (1 skipped, 0 failed).
  * store/serde.c — null state is sentinel-encoded in the payload, so
    the wire format no longer carries a trailing bitmap region.
    Removed null_bitmap_size / ser_null_bitmap / de_null_bitmap and
    every call site; size and (de)serializer paths shrink to just
    `type + attrs + len + data`.  HAS_NULLS in the attrs byte tells
    the decoder to flip the same bit on the reconstructed vec —
    ray_vec_is_null recovers null state from the sentinel payload.

  * ops/linkop.c — removed promote_inline_to_ext.  Sentinel-encoded
    nulls don't consume the union arm, so ray_link_attach writes
    link_target into bytes 8-15 unconditionally.  Cleaned up the SYM
    sym_dict propagation guard in link_deref_array_for_target_col to
    drop the now-vacuous NULLMAP_EXT branch.

  * test/test_link.c — test_link_with_inline_nulls_promotes was
    asserting NULLMAP_EXT on the linked vec.  Rewritten in place to
    reflect the new contract: HAS_LINK + HAS_NULLS coexist freely on
    a sentinel-encoded column.

2450 / 2451 passing under ASan + UBSan (1 skipped, 0 failed).
With the producer-side ext-bitmap arm fully retired (S4.4), the bit
0x20 on vectors and the ext_nullmap pointer it gated are never set
anywhere in src/.  This commit reclaims the dead surface:

  - vec.c: delete ray_vec_nullmap_bytes (public), vec_inline_nullmap,
    read_nullmap_bit, the RAYFORCE_NULL_AUDIT cross-check + audit
    infrastructure (pthread mutex, backtrace, dedup table), and the
    vec_drop_index_inplace NULLMAP_EXT restore.  ray_vec_is_null
    becomes a clean sentinel-only reader; the default arm is
    unreachable in practice (BOOL/U8 locked out at producer) and
    returns false.
  - vec.h: drop ray_vec_nullmap_bytes declaration.
  - idxop.{c,h}: saved_attrs now records HAS_NULLS only; the SYM/STR
    saved-pointer branches in release_saved/retain_saved were
    unreachable (only numeric vecs may attach) — both functions are
    now documented no-ops.  attach_finalize / ray_index_drop stop
    propagating the dead NULLMAP_EXT bit.
  - heap.{c,h}: drop RAY_ATTR_NULLMAP_EXT release/retain branches in
    ray_release_owned_refs and ray_retain_owned_refs, the
    NULLMAP_EXT-aware mmod==1 munmap size add-on, and the
    ray_detach_owned_refs branch.  Remove the #define.  Update the
    bit-layout comment to note 0x20 is reserved (legacy on-disk).
  - store/col.c: drop the on-save NULLMAP_EXT scrub (no producer
    sets it anyway).  Keep the on-load reject as a local
    LEGACY_DISK_NULLMAP_EXT_BIT constant for malformed legacy files.
  - ops/internal.h: par_set_null sheds the bitmap fallback (BOOL/U8
    are non-nullable).  par_prepare_nullmap becomes a no-op.
    par_finalize_nulls scans payload sentinels for the post-execution
    HAS_NULLS check (matches the sentinel-only producer contract).
  - sort.c / lang/eval.c / ops/collection.c / ops/rerank.c: the
    sym_dict propagation guards no longer need the
    `(!HAS_NULLS || NULLMAP_EXT)` clause — bytes 0-7 are no longer
    overwritten by an inline bitmap.
  - Makefile: drop the `audit` target (the cross-check it built is gone).
  - include/rayforce.h: the ext_nullmap union field STAYS (it
    aliases sym_dict in the same struct arm); update the bytes-0-15
    layout comment to reflect the new reality.

Tests:
  - test_heap.c: rewrite test_nullmap_ext_owned_ref →
    test_sentinel_null_release; test_scratch_realloc_nullmap_ext →
    test_scratch_realloc_sentinel_nulls.
  - test_index.c: rewrite test_index_attach_drop_with_ext_nullmap →
    test_index_attach_drop_large_sentinel_nulls,
    test_index_retain_saved_ext_nullmap →
    test_index_drop_shared_with_large_nulls, and
    test_index_release_saved_str_sym / retain_saved_str_sym →
    test_index_release_saved_noop / retain_saved_noop (since both
    helpers are now no-ops).  Drop NULLMAP_EXT from the snapshot
    struct.  Trim narrative comments on the remaining tests.
  - test_store.c: rewrite test_col_ext_nullmap_roundtrip →
    test_col_large_nullable_roundtrip;
    test_col_validate_mapped_bitmap_truncated →
    test_col_validate_mapped_legacy_ext_bitmap_rejected (now exercises
    the on-load legacy-format reject).
  - test_morsel.c / test_vec.c: trim historical comments.
  - test/rfl/{mem/heap_coverage,ops/internal_coverage}.rfl: trim
    NULLMAP_EXT narrative.

Build clean with -Werror.  Full suite: 2450 of 2451 passed
(1 pre-existing skip, 0 failed).
No code reads or writes either field anymore (S4.4 finished off the
last allocators).  Replace each with an `_aux_*_lo[8]` byte-padding
field so the surviving sym_dict / str_pool pointers stay at offset 8
without aliasing a misleading dead name.  The `nullmap[16]` raw-byte
view stays — atoms, env, table/dict/list zero-init, col on-disk
headers, and idxop's saved snapshot all keep using it.

Grep across src/, test/, include/ shows zero remaining `ext_nullmap`
references.

2450 / 2451 passing under ASan + UBSan (1 skipped, 0 failed).
The sentinel-null migration is complete; this commit removes the
historical narrative (Phase 1/2/3/3a/3b/7, dual encoding, bitmap arm,
NULLMAP_EXT, ext_nullmap, lockdown, etc.) from comments, doc strings,
and test descriptions, leaving only descriptions of what the code does
today.  Algorithmic phase markers (Phase 1: histogram → Pass 1:
histogram, etc.) were renamed mechanically so the verification grep
returns no matches for migration vocabulary.

- Delete completed plan/spec:
  docs/superpowers/plans/2026-05-18-sentinel-migration-finish.md
  docs/superpowers/specs/2026-05-18-sentinel-migration-finish-design.md
- Delete the legacy on-disk-format guard in src/store/col.c (the 0x20
  attribute bit it rejected is no longer produced anywhere; greenfield)
  and the corresponding store/col_validate_legacy_ext_bitmap_rejected
  test.
- Rename test/rfl/null/{bool_u8_lockdown,f64_dual_encoding,
  integer_dual_encoding}.rfl to neutral names.
- Rewrite block comments narrating the migration (rayforce.h NULL_*
  paragraph, atom.c ray_typed_null doc, vec.c set/copy_nulls,
  group.c per-agg sentinel guards, expr.c null propagation,
  query.c UPDATE/CSV null paths, etc.) so they describe the current
  contract without the historical evolution.
- Scrub stray "post-sentinel-migration" / "Phase 3a-13" / "Phase 1
  lockdown" / "bitmap arm" mentions from test fixtures and rfl tests.
The macro declares `int64_t v = (int64_t)(VAL_EXPR)`, then several
call sites pass a local also named `v` — the new declaration is in
scope before the initializer runs, so the initializer self-references
the uninitialized new `v`.  clang catches this with -Wuninitialized;
gcc at -O3 catches it with -Wmaybe-uninitialized.

Rename the macro's locals (`_ins_v`, `_ins_h`, `_ins_slot`,
`_ins_cur`) so they cannot collide with caller scope.

Fixes the macOS debug, macOS release, and Ubuntu release CI jobs.
@singaraiona singaraiona merged commit 9dedb7f into master May 18, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants