ESP32 8KB CSI embedding v2 — retract single-class '100% presence', ship honest 82.3% held-out triplet metric #882

@ruvnet

ESP32 8KB CSI embedding — v2 honest re-benchmark + converged encoder

🤗 https://huggingface.co/ruvnet/wifi-densepose-pretrained (v2 files added; v1 kept for compatibility)

What was wrong with v1

  • The v1 contrastive encoder logged a flat training loss (0.13517 at every epoch), so the encoder was not learning.
  • Its "100% presence accuracy" headline was measured on a single-class recording: an overnight capture of one sleeping person, 6,062 / 6,063 frames labelled "present", 1 "absent". A constant "yes" predictor scores 99.98% — so the figure is real but says nothing about generalization. Retracted.
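The arithmetic behind the retraction can be checked in a few lines. This is a minimal sketch using only the frame counts quoted above; the balanced-accuracy line is an added illustration (it is not a metric reported in this issue) of how a class-aware score treats the same constant predictor.

```python
# Label counts reported for the v1 overnight recording.
n_present, n_absent = 6062, 1
n_total = n_present + n_absent

# A predictor that always answers "present" is right on every "present" frame
# and wrong on the single "absent" frame.
constant_yes_acc = n_present / n_total

# Balanced accuracy averages per-class recall: recall is 1.0 on "present"
# and 0.0 on "absent" for the constant predictor.
recall_present = n_present / n_present
recall_absent = 0 / n_absent
balanced_acc = (recall_present + recall_absent) / 2

print(f"constant-yes accuracy: {constant_yes_acc:.2%}")  # 99.98%
print(f"balanced accuracy:     {balanced_acc:.2%}")      # 50.00%
```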

v2 fix (honest, label-free, time-disjoint)

Retrained the same 8→64→128 encoder with a working InfoNCE objective. Metric: held-out temporal-triplet accuracy = P(d(anchor, temporal-positive) < d(anchor, temporal-negative)), evaluated on the last 20% of the recording by time (no leakage).
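For readers unfamiliar with InfoNCE, here is a minimal pure-Python sketch of the loss for one batch. It is not the code in train_csi_embed.py: the function name `info_nce`, the cosine-similarity logits, and the `temperature` value are assumptions made for illustration.

```python
import math

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss for a batch (illustrative sketch).

    positives[i] is the positive for anchors[i]; every other positives[j]
    in the batch acts as a negative for anchor i.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    losses = []
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching positive as the target class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Well-separated pairs give a loss near 0.
aligned_loss = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
# Identical embeddings for every frame give log(batch_size), however long
# training runs.
collapsed_loss = info_nce([[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
print(aligned_loss, collapsed_loss)
```

A loss that stays constant across epochs, as v1's did, means the embeddings are not moving; the collapsed case above is one example of an input that produces a constant value, shown only to illustrate the behaviour and not as a diagnosis of v1.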

| Encoder | Held-out temporal-triplet acc |
| --- | --- |
| Raw 8-dim features | 66.4% |
| Random-init encoder | 69.6% |
| v2 trained encoder | 82.3% (+15.9 pts over raw, properly converged) |

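The held-out temporal-triplet metric can be sketched as follows. This issue does not say how temporal positives and negatives are sampled, so the `pos_window` and `neg_gap` parameters, the Euclidean distance, and the function names are all assumptions for illustration; only the "last 20% by time" split comes from the description above.

```python
import math
import random

def time_split(frames, holdout_frac=0.2):
    """Split by time: the last `holdout_frac` of frames is held out."""
    cut = len(frames) - int(round(len(frames) * holdout_frac))
    return frames[:cut], frames[cut:]

def triplet_accuracy(embeddings, pos_window=5, neg_gap=200,
                     n_triplets=2000, seed=0):
    """Estimate P(d(anchor, positive) < d(anchor, negative)).

    Assumed sampling: a positive lies within `pos_window` frames of the
    anchor, a negative lies at least `neg_gap` frames away.
    """
    rng = random.Random(seed)
    n = len(embeddings)
    hits = total = 0
    for _ in range(n_triplets):
        a = rng.randrange(n)
        lo, hi = max(0, a - pos_window), min(n, a + pos_window + 1)
        pos_candidates = [j for j in range(lo, hi) if j != a]
        neg_candidates = [j for j in range(n) if abs(j - a) >= neg_gap]
        if not pos_candidates or not neg_candidates:
            continue  # no valid triplet for this anchor
        p = rng.choice(pos_candidates)
        q = rng.choice(neg_candidates)
        d_pos = math.dist(embeddings[a], embeddings[p])
        d_neg = math.dist(embeddings[a], embeddings[q])
        hits += d_pos < d_neg
        total += 1
    return hits / total if total else float("nan")

# Toy check: 1-D embeddings that drift linearly with time are perfectly ordered.
frames = [[float(t)] for t in range(100)]
train, held_out = time_split(frames)
toy_acc = triplet_accuracy(held_out, pos_window=2, neg_gap=10, n_triplets=200)
print(len(train), len(held_out), toy_acc)  # 80 20 1.0
```

Scoring the raw features, a random-init encoder, and the trained encoder with the same sampler is what makes the three rows of the table comparable.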
  • 4-bit packed encoder + fp16 standardizer = 4.56 KB (fits the 8 KB ESP32 SRAM budget).
  • Encoder weights SHA-256: 3b37bca66e6050c50ccbc0f6e0501824f258bfdd8675dc0f4541b1e2e96feecd
  • Repro: python aether-arena/staging/train_csi_embed.py, using the recording data/recordings/overnight-1775217646.csi.jsonl (6,063 feature frames).
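To make the size figure concrete, here is a rough sketch of 4-bit packing and a parameter-count estimate for an 8→64→128 encoder. The layout is assumed, not taken from the released files: two dense layers with biases, biases quantized to 4 bits as well, two codes per byte, and a standardizer holding one fp16 mean and one fp16 scale per input feature.

```python
def pack_nibbles(codes):
    """Pack 4-bit codes (ints in 0..15) two per byte, low nibble first."""
    codes = list(codes)
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("codes must fit in 4 bits")
    if len(codes) % 2:
        codes.append(0)  # pad to a whole byte
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))

def unpack_nibbles(packed, count):
    """Inverse of pack_nibbles; `count` drops the padding nibble if any."""
    out = []
    for b in packed:
        out.append(b & 0x0F)
        out.append(b >> 4)
    return out[:count]

# Assumed architecture: dense 8->64 and 64->128, each with a bias vector.
layers = [(8, 64), (64, 128)]
n_weights = sum(n_in * n_out for n_in, n_out in layers)  # 512 + 8192
n_biases = sum(n_out for _, n_out in layers)             # 64 + 128
encoder_bytes = (n_weights + n_biases + 1) // 2          # two 4-bit codes per byte
standardizer_bytes = 8 * 2 * 2                           # 8 features x (mean, scale) x fp16
total_bytes = encoder_bytes + standardizer_bytes

print(n_weights + n_biases, "parameters ->", total_bytes, "bytes")
assert unpack_nibbles(pack_nibbles([1, 15, 7]), 3) == [1, 15, 7]
```

Under these assumptions the raw payload is 4,480 bytes. The reported 4.56 KB is somewhat larger; this sketch does not model quantization scales, headers, or any other metadata that may account for the difference.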

Honest scope

One room, one capture, two nodes. The triplet metric measures embedding quality, not downstream presence or vitals accuracy; measuring those needs multi-class, multi-room labelled data, which we don't have yet for this 2.4 GHz feature. For pose SOTA on a public benchmark, see the separate 5 GHz model ruvnet/wifi-densepose-mmfi-pose (82.69% torso-PCK@20 on MM-Fi), tracked in #880.

Labels: enhancement