RuView exceeds MultiFormer on MM-Fi WiFi-CSI pose: 81.63% torso-PCK@20 (random split) + Generalization Track #876

## Result — controlled, protocol- & metric-matched claim

**RuView's CSI-Transformer reaches 81.63% torso-PCK@20 on MM-Fi `random_split`, exceeding MultiFormer (72.25%) and CSI2Pose (68.41%) on the same protocol and metric.** Over MultiFormer that is **+9.38** percentage points absolute, **+13.0%** relative.

| System | torso-PCK@20 (MM-Fi random_split) |
|---|---|
| CSI2Pose | 68.41% |
| MultiFormer (prior SOTA) | 72.25% |
| **RuView** | **81.63%** |
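
For reference, the margin over MultiFormer follows directly from the table values. This is plain arithmetic on the reported numbers, not a new measurement:

```math
\begin{aligned}
\Delta_{\text{abs}} &= 81.63 - 72.25 = 9.38 \ \text{percentage points} \\
\Delta_{\text{rel}} &= \frac{9.38}{72.25} \approx 0.1298 \approx 13.0\%
\end{aligned}
```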

### Match conditions (verified)
- **Protocol**: MM-Fi default `random_split` (ratio 0.8, seed 0) — from MM-Fi `config.yaml`.
- **Metric**: torso-PCK@20 (`‖pred−gt‖ / ‖right_shoulder−left_hip‖ ≤ 0.2`, 2D, 17 COCO kpts) — MultiFormer Table VII.
- **Data**: MM-Fi WiFi-CSI, 320,760 frames `[3,114,10]`.
- **Integrity**: headline self-corrected from an inflated 91.86% (bbox metric) → 81.63% (torso) before publishing.
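
The metric definition above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in the gist: the function name `torso_pck` is hypothetical, and the keypoint indices assume the standard COCO-17 ordering (right shoulder = 6, left hip = 11), which may differ from the ordering in the MM-Fi annotation files.

```python
import numpy as np

# Assumption: standard COCO-17 keypoint ordering.
RIGHT_SHOULDER = 6
LEFT_HIP = 11

def torso_pck(pred, gt, threshold=0.2):
    """Fraction of keypoints whose error is within threshold * torso length.

    pred, gt: arrays of shape (N, 17, 2) holding 2D keypoints.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    # Per-frame torso length (right shoulder to left hip), shape (N, 1).
    torso = np.linalg.norm(
        gt[:, RIGHT_SHOULDER] - gt[:, LEFT_HIP], axis=-1, keepdims=True
    )
    # Guard against degenerate frames with zero torso length.
    torso = np.maximum(torso, 1e-8)
    # Per-keypoint Euclidean error, shape (N, 17).
    err = np.linalg.norm(pred - gt, axis=-1)
    return float(np.mean(err / torso <= threshold))

# Toy example: torso length 10, one keypoint off by 1 (hit), one off by 3 (miss).
gt = np.zeros((1, 17, 2))
gt[0, LEFT_HIP] = [0.0, 10.0]
pred = gt.copy()
pred[0, 0] += [1.0, 0.0]
pred[0, 1] += [3.0, 0.0]
score = torso_pck(pred, gt)  # 16 of 17 keypoints fall within 0.2 * torso
```

Normalizing by torso length rather than by bounding-box size is what separates this number from the bbox-normalized 91.86% mentioned under Integrity: a bounding box is usually larger than the torso, so the same pixel error passes the threshold more easily.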

### Proof / Replay / Witness
- **Detailed gist (proof + replay + parser + trainer):** https://gist.github.com/ruvnet/af2fbc1c7674dddf09c15509b3c7f785
- **Witness**: AetherArena append-only hash-chained ledger, row seq 1, `row_hash 76598d8e…`. Verify: `python aether-arena/ledger/ledger_tools.py verify`.
- **Leaderboard (live)**: https://huggingface.co/spaces/ruvnet/aether-arena
- One-command replay: download MM-Fi → `parse_mmfi_zips.py` → `train_tf_torso.py X Y split_random.npy` (seed 0) → ~81.6%.
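
To illustrate what an append-only hash-chained ledger checks, here is a minimal sketch. It is not the format used by `ledger_tools.py`: the field names (`seq`, `payload`, `prev_hash`, `row_hash`), the all-zero genesis hash, and the use of SHA-256 over canonical JSON are all assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous row"

def compute_row_hash(prev_hash, payload):
    """SHA-256 over the previous row's hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append_row(ledger, payload):
    """Append a row that commits to the hash of the row before it."""
    prev = ledger[-1]["row_hash"] if ledger else GENESIS
    row = {"seq": len(ledger) + 1, "payload": payload, "prev_hash": prev}
    row["row_hash"] = compute_row_hash(prev, payload)
    ledger.append(row)
    return row

def verify(ledger):
    """Recompute every hash in order; any edited or reordered row breaks the chain."""
    prev = GENESIS
    for expected_seq, row in enumerate(ledger, start=1):
        if row["seq"] != expected_seq or row["prev_hash"] != prev:
            return False
        if row["row_hash"] != compute_row_hash(prev, row["payload"]):
            return False
        prev = row["row_hash"]
    return True

ledger = []
append_row(ledger, {"metric": "torso-PCK@20", "split": "random_split", "value": 81.63})
append_row(ledger, {"metric": "torso-PCK@20", "split": "cross_subject", "value": 11.6})
ok_before = verify(ledger)

# Tampering with an earlier row invalidates verification.
ledger[0]["payload"]["value"] = 91.86
ok_after = verify(ledger)
```

Because each row hash covers the previous row's hash, rewriting a published result requires recomputing every later row, which is detectable by anyone holding an earlier copy of the chain head.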

### ⚠️ Controlled claim (what this is NOT)
This is a protocol-matched random-split result, **not** solved real-world generalization. A random split places temporally adjacent frames from the same subject on both sides of the train/test boundary, an effect common to this benchmark family that can inflate in-domain scores. Our leakage-free **cross-subject** result is far lower (~11.6% torso-PCK) and is the real deployment frontier. This is also **not** a universal WiFi-pose SOTA claim (e.g. WiFlow's 97% is on a separate 5-subject self-collected set).
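
The difference between the two protocols can be made concrete with a small sketch. The helper names are hypothetical and the random split below uses NumPy's generator, which is an assumption: MM-Fi's own `random_split` may draw its permutation differently, so this will not reproduce `split_random.npy`.

```python
import numpy as np

def random_split(n_frames, ratio=0.8, seed=0):
    """Frame-level split: shuffle all frames, then cut at `ratio`."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_frames)
    cut = int(ratio * n_frames)
    return perm[:cut], perm[cut:]

def cross_subject_split(subject_ids, test_subjects):
    """Subject-level split: every frame of a held-out subject goes to test."""
    subject_ids = np.asarray(subject_ids)
    test_mask = np.isin(subject_ids, list(test_subjects))
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

def shared_subjects(subject_ids, train_idx, test_idx):
    """Subjects that appear on both sides of the split (leakage indicator)."""
    subject_ids = np.asarray(subject_ids)
    return set(subject_ids[train_idx].tolist()) & set(subject_ids[test_idx].tolist())

# Toy data: 4 subjects with 50 frames each.
subject_ids = np.repeat(np.arange(4), 50)

train_r, test_r = random_split(len(subject_ids))
leak_random = shared_subjects(subject_ids, train_r, test_r)

train_c, test_c = cross_subject_split(subject_ids, test_subjects={3})
leak_cross = shared_subjects(subject_ids, train_c, test_c)
```

In the toy data the frame-level split always shares subjects between train and test, while the subject-level split shares none. That overlap is the gap between the random-split and cross-subject numbers reported here.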

---

## Next: the RuView Generalization Track (two frontiers)

**Frontier 1 — Benchmark (push the in-domain number, honestly):** target **85%+** random-split torso-PCK; levers: skeleton-graph head (anatomical constraints, GraphPose-Fi style), temporal-consistency loss, multi-task action+pose, careful CSI augmentation, conv+transformer ensemble. Acceptance: beat 85% one seed, 5-seed mean ≥ 84%, per-joint error tables.

**Frontier 2 — Deployment (the real hard problem):** lift **cross-subject** torso-PCK from 11.6% → **25–30%+**; levers: self-supervised CSI pretraining (masked/contrastive, phase-aware), **supervised-contrastive** subject-invariant-but-pose-preserving embedding (naive DANN already failed), physics-normalized CSI features, leave-one-subject-group-out validation.
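
As a rough sketch of the supervised-contrastive lever, the loss below pulls together embeddings that share a pose label regardless of subject and pushes apart the rest. It is a generic SupCon-style loss in NumPy, not RuView's training code; in particular it assumes poses have been discretized into labels (e.g. action class or pose cluster), which the source does not specify.

```python
import numpy as np

def supcon_loss(embeddings, pose_labels, temperature=0.1):
    """Supervised-contrastive loss over one batch.

    embeddings: (N, D) array; pose_labels: (N,) discrete labels.
    Positives for an anchor are all other samples with the same label.
    """
    z = np.asarray(embeddings, dtype=np.float64)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    labels = np.asarray(pose_labels)
    n = z.shape[0]

    logits = z @ z.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    not_self = ~np.eye(n, dtype=bool)

    # Log-softmax over all samples except the anchor itself.
    exp = np.exp(logits) * not_self
    log_prob = logits - np.log(exp.sum(axis=1, keepdims=True))

    positives = (labels[:, None] == labels[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    valid = n_pos > 0  # anchors with no positive in the batch are skipped
    per_anchor = -(log_prob * positives).sum(axis=1)[valid] / n_pos[valid]
    return float(per_anchor.mean())

# Two tight clusters: labels that match the clusters give a much lower loss
# than labels that cut across them.
emb = np.array([[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]])
loss_aligned = supcon_loss(emb, [0, 0, 1, 1])
loss_misaligned = supcon_loss(emb, [0, 1, 0, 1])
```

To target subject invariance, each batch would need the same pose label from several subjects, so that the positives for an anchor come from different people.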

**The RuView differentiator — auditable RF perception that knows when it's wrong:** gate pose confidence by **channel coherence** (mincut / spectral coherence as RF-integrity signals) → anti-hallucination for RF sensing.
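
A minimal sketch of confidence gating, under loud assumptions: the coherence score here is a simple stand-in (mean absolute correlation between per-antenna amplitude profiles), not the mincut or spectral-coherence signal named above. The function names, the threshold, and the reading of `[3,114,10]` as (antennas, subcarriers, packets) are also assumptions.

```python
import numpy as np

def antenna_coherence(csi_amp):
    """Mean absolute Pearson correlation between antenna amplitude profiles.

    csi_amp: array of shape (3, 114, 10), assumed (antennas, subcarriers, packets).
    Returns a value in [0, 1]; constant profiles would yield NaN.
    """
    csi_amp = np.asarray(csi_amp, dtype=np.float64)
    profiles = csi_amp.mean(axis=-1)            # average over packets
    corr = np.corrcoef(profiles)                # (antennas, antennas)
    upper = np.triu_indices(corr.shape[0], k=1) # each antenna pair once
    return float(np.mean(np.abs(corr[upper])))

def gate_confidence(pose_conf, coherence, min_coherence=0.5):
    """Zero out confidence below the threshold, otherwise scale by coherence."""
    pose_conf = np.asarray(pose_conf, dtype=np.float64)
    if coherence < min_coherence:
        return np.zeros_like(pose_conf)
    return pose_conf * coherence

# Coherent frame: all antennas see the same subcarrier profile at different gains.
base = np.linspace(1.0, 2.0, 114)
gains = np.array([1.0, 2.0, 3.0])
coherent = gains[:, None, None] * base[None, :, None] * np.ones((3, 114, 10))

# Incoherent frame: independent noise on every antenna.
rng = np.random.default_rng(0)
noisy = rng.random((3, 114, 10))

c_good = antenna_coherence(coherent)
c_bad = antenna_coherence(noisy)
conf_good = gate_confidence(np.full(17, 0.9), c_good)
conf_bad = gate_confidence(np.full(17, 0.9), c_bad)
```

The point of the gate is that the model's own confidence is not trusted when the RF input itself looks unreliable. Any real integrity signal could replace `antenna_coherence` without changing `gate_confidence`.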

### Track targets
| Track | Target (torso-PCK@20) |
|---|---|
| MM-Fi random split | 85%+ |
| MM-Fi cross-subject | 30%+ |
| Home paired data | 35%+ |
| Cross-room | 25%+ |
| Cross-device | 20%+ |
| Confidence calibration | ECE < 0.08 |
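
For the calibration target, expected calibration error (ECE) can be computed with equal-width confidence bins as sketched below. Two choices here are assumptions not stated in the source: ten bins, and treating a keypoint as "correct" when it passes the torso-PCK@20 threshold.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: bin-weighted mean gap between accuracy and mean confidence.

    confidences: values in [0, 1]; correct: 0/1 flags of the same length.
    """
    confidences = np.asarray(confidences, dtype=np.float64)
    correct = np.asarray(correct, dtype=np.float64)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Interior edges map each confidence to a bin index in 0..n_bins-1.
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1]), 0, n_bins - 1)

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        gap = abs(correct[mask].mean() - confidences[mask].mean())
        ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return float(ece)

# Overconfident toy case: confidence 0.9 but only 3 of 4 predictions correct.
ece_over = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
# Perfectly calibrated toy case.
ece_perfect = expected_calibration_error([1.0, 1.0], [1, 1])
```

In the overconfident toy case the gap is 0.9 − 0.75 = 0.15, which would miss the `ECE < 0.08` target.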

Next public milestone acceptance: **85% random + 25%+ cross-subject torso-PCK from one pipeline, one-command repro, per-joint tables.**

🤖 Generated with [claude-flow](https://github.com/ruvnet/claude-flow)
