feat(distill): APR_DISTILL_MAX_STEPS smoke-validation mode (PMAT-706) by noahgift · Pull Request #1888 · paiml/aprender

noahgift · 2026-05-22T14:08:04Z

Summary

`APR_DISTILL_MAX_STEPS=N` runs at most N training steps, prints a loss-trajectory and projected wall-time summary, then exits without writing output. Operators can validate the cascade in ~60 s before committing to a 30-50 h Stage D run.

Closes the diagnostic loop on the PMAT-704 cascade. PMAT-705 (#1881) surfaced per-step loss; PMAT-706 adds the early-break so operators don't wait through the full epoch budget.

Changes

`crates/aprender-train-distill/src/pipeline.rs`:

* env vars: `APR_DISTILL_MAX_STEPS=N` (early-break) + `APR_DISTILL_PROJECT_TO_STEPS` (default 50000, sets projection target)
* N=0 or invalid → early `Err` with clear message
* train loop breaks at `step >= N`, prints two `[SMOKE]` summary lines
* `execute()` short-circuits export when smoke mode → no `model.safetensors` / `output.apr` written
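The env-var handling above can be sketched as a small pure function. This is a minimal illustration, not the code in `pipeline.rs`: the helper name `parse_max_steps`, its signature, and the error wording are assumptions, and only the unset/empty, N > 0, and N = 0 / non-integer cases come from the PR description.

```rust
/// Hypothetical helper (not from the PR): interprets the raw value of
/// `APR_DISTILL_MAX_STEPS`.
///
/// * `None` or an empty string -> `Ok(None)`: smoke mode off, old behavior.
/// * a positive integer N      -> `Ok(Some(N))`: run at most N steps.
/// * `0` or a non-integer      -> `Err(..)` with a descriptive message.
fn parse_max_steps(raw: Option<&str>) -> Result<Option<usize>, String> {
    let value = match raw {
        None | Some("") => return Ok(None),
        Some(value) => value,
    };
    match value.parse::<usize>() {
        Ok(0) => Err("APR_DISTILL_MAX_STEPS must be > 0, got 0".to_string()),
        Ok(n) => Ok(Some(n)),
        // The error wording is illustrative; the PR only promises a clear message.
        Err(_) => Err(format!(
            "APR_DISTILL_MAX_STEPS must be a positive integer, got {value:?}"
        )),
    }
}
```

A caller could feed it the real environment with `parse_max_steps(std::env::var("APR_DISTILL_MAX_STEPS").ok().as_deref())`. Keeping the parser free of direct environment access means it can be unit-tested without touching global state.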

Contract

`contracts/apr-distill-smoke-validation-v1.yaml`:

* 3 equations + 4 falsifiers + 2 Kani harnesses + qa_gate F-SMOKE-001
* Validates clean (`pv validate` 0/0)

Tests

4 unit tests in `pmat_706_smoke_validation`, all PASS (serialized via Mutex to avoid env race):

* `falsify_smoke_001_exact_step_count`
* `falsify_smoke_002_no_regression_when_unset`
* `falsify_smoke_004_no_output_in_smoke`
* `smoke_zero_steps_returns_err`
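Because environment variables are process-global, tests that set `APR_DISTILL_MAX_STEPS` have to be serialized. One way to do that with a Mutex is sketched below. The helper `with_env_var` is hypothetical and the actual test module may be structured differently. The `unsafe` blocks are there because `std::env::set_var` and `remove_var` are `unsafe` in the Rust 2024 edition; on earlier editions they are redundant, hence the `allow`.

```rust
use std::sync::{Mutex, MutexGuard};

/// Process-wide lock guarding environment mutation in tests.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Hypothetical helper: runs `body` with `key` set to `value` (or unset when
/// `value` is `None`) while holding `ENV_LOCK`, then restores the old value.
/// If `body` panics the old value is not restored; a drop guard would fix that.
#[allow(unused_unsafe)]
fn with_env_var<T>(key: &str, value: Option<&str>, body: impl FnOnce() -> T) -> T {
    // Recover from poisoning so one failed test does not cascade into others.
    let _guard: MutexGuard<'_, ()> =
        ENV_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
    let previous = std::env::var(key).ok();
    unsafe {
        match value {
            Some(v) => std::env::set_var(key, v),
            None => std::env::remove_var(key),
        }
    }
    let result = body();
    unsafe {
        match previous {
            Some(v) => std::env::set_var(key, v),
            None => std::env::remove_var(key),
        }
    }
    result
}
```

The lock only helps if every test that reads or writes the variable goes through it; a test that bypasses the helper can still race.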

Output

```
[PMAT-706] smoke mode: APR_DISTILL_MAX_STEPS=10 (early-break after 10 steps; no final output.apr written)
...
[SMOKE] 10 steps in 1.20s: initial_loss=3.4567, final_loss=3.1234, throughput=8.33 step/s
[SMOKE] projected full-run wall time (50000 steps): 1.67h / 100.0 min / 6000s
[PMAT-706] smoke mode: skipping export — no model.safetensors / output.apr written
```
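The numbers in the sample output are mutually consistent: 10 steps in 1.20 s is 8.33 step/s, and 50000 steps at that rate take 6000 s (100 min, 1.67 h). A minimal sketch of that arithmetic follows. The `SmokeSummary` type and its method names are hypothetical, and the seconds field is printed without decimals to match the sample above.

```rust
/// Hypothetical container for what a smoke run observed (not from the PR).
struct SmokeSummary {
    steps: usize,
    elapsed_secs: f64,
    initial_loss: f64,
    final_loss: f64,
}

impl SmokeSummary {
    /// Observed throughput in steps per second.
    fn throughput(&self) -> f64 {
        self.steps as f64 / self.elapsed_secs
    }

    /// Linear extrapolation of wall time to `project_to_steps` at the
    /// observed throughput. Assumes `steps > 0`, which holds because
    /// N = 0 is rejected before training starts.
    fn projected_secs(&self, project_to_steps: usize) -> f64 {
        project_to_steps as f64 * self.elapsed_secs / self.steps as f64
    }

    /// Renders the two `[SMOKE]` lines in the format shown in the sample.
    fn lines(&self, project_to_steps: usize) -> [String; 2] {
        let secs = self.projected_secs(project_to_steps);
        [
            format!(
                "[SMOKE] {} steps in {:.2}s: initial_loss={:.4}, final_loss={:.4}, throughput={:.2} step/s",
                self.steps,
                self.elapsed_secs,
                self.initial_loss,
                self.final_loss,
                self.throughput()
            ),
            format!(
                "[SMOKE] projected full-run wall time ({} steps): {:.2}h / {:.1} min / {:.0}s",
                project_to_steps,
                secs / 3600.0,
                secs / 60.0,
                secs
            ),
        ]
    }
}
```

The projection is linear, so it ignores warm-up effects in the first few steps; a very small N may overestimate the full-run time.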

Methodology

Per memory `feedback_a_priori_theoretical_falsification.md`: 30 min of math saves 8 h of GPU. PMAT-706 is the runtime analog — 60 s of smoke saves 8 h of staring at a silent process.

🤖 Generated with Claude Code

When the operator sets `APR_DISTILL_MAX_STEPS=N` (default unset), the distill training loop runs at most N steps, prints a per-run summary, and exits without writing a final output model. Lets operators validate the cascade end-to-end in ~60 s before committing to a 30-50 h Stage D production run.

The PMAT-704 cascade post-mortem found that the 7B vocab-aligned 500-step validation hung at step 0 for 1.5 h with no per-step output. PMAT-705 (#1881) added ProgressCallback to surface per-step loss during normal runs. PMAT-706 adds the complementary EARLY-BREAK so operators don't have to wait through the full epoch budget to see if something's wrong.

## Changes

`crates/aprender-train-distill/src/pipeline.rs`:

* Reads `APR_DISTILL_MAX_STEPS` env var. Empty/unset = old behavior (no regression). N > 0 = run at most N steps then break. N = 0 or non-integer = early Err with clear message.
* Optional `APR_DISTILL_PROJECT_TO_STEPS` env var (default 50000) controls the projected-wall-time target in the summary.
* `train()` early-breaks the inner loop when step >= max_steps, prints two `[SMOKE]` summary lines (loss trajectory + projected wall time at the observed throughput), and returns empty weights / shapes via the normal Result path.
* `execute()` detects smoke mode (env var set) and short-circuits the export step: no `model.safetensors` / `output.apr` is written, so downstream tools (`apr eval`, `apr run`) can't accidentally consume a smoke result.

## Summary format

    [PMAT-706] smoke mode: APR_DISTILL_MAX_STEPS=N (early-break after N steps; no final output.apr written)
    ...
    [SMOKE] N steps in T.TTs: initial_loss=X.XXXX, final_loss=Y.YYYY, throughput=Z.ZZ step/s
    [SMOKE] projected full-run wall time (50000 steps): H.HHh / W.W min / S.Ss
    [PMAT-706] smoke mode: skipping export — no model.safetensors / output.apr written

## Contract

`contracts/apr-distill-smoke-validation-v1.yaml`:

* 3 equations: early_break_condition (off-by-one tight), smoke_summary_format, no_side_effects.
* 4 falsifiers (FT-SMOKE-001..004) covering exact step count, no-regression when unset, summary line format, no output.apr written.
* 2 Kani harnesses (count is tight; 0 steps is degenerate, not panic).
* qa_gate F-SMOKE-001.
* Validates clean: `pv validate` 0 errors, 0 warnings.

## Tests

`pipeline::tests::pmat_706_smoke_validation`:

* `falsify_smoke_001_exact_step_count`: N=10 returns metrics.steps_completed == 10
* `falsify_smoke_002_no_regression_when_unset`: unset → full epochs run
* `falsify_smoke_004_no_output_in_smoke`: output_path empty + no model.* files
* `smoke_zero_steps_returns_err`: N=0 returns Err

Tests share global env state; serialized via a Mutex (ENV_LOCK) so they don't race in parallel threads. All 4 PASS.

## Methodology

This closes the diagnostic loop on the PMAT-704 cascade post-mortem lesson. Per memory `feedback_a_priori_theoretical_falsification.md`: 30 min of math saves 8 h of GPU. PMAT-706 is the runtime analog: 60 s of smoke saves 8 h of staring at a silent process.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
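The early-break condition (`step >= max_steps`, checked before a step runs) gives an exact step count with no off-by-one. The loop below is a simplified stand-in for illustration only: `run_steps`, its parameters, and the closure-based step function are hypothetical and do not mirror the real `train()` signature.

```rust
/// Hypothetical stand-in for the training loop. `step_fn` performs one
/// optimisation step and returns its loss; `max_steps` is the parsed value
/// of `APR_DISTILL_MAX_STEPS` (`None` when smoke mode is off).
/// Returns the number of completed steps and the per-step losses.
fn run_steps(
    total_steps: usize,
    max_steps: Option<usize>,
    mut step_fn: impl FnMut(usize) -> f64,
) -> (usize, Vec<f64>) {
    let mut losses = Vec::new();
    let mut step = 0;
    while step < total_steps {
        // Check before running the step, so exactly `limit` steps execute.
        if let Some(limit) = max_steps {
            if step >= limit {
                break;
            }
        }
        losses.push(step_fn(step));
        step += 1;
    }
    (step, losses)
}
```

With `max_steps = Some(10)` and a budget of 100 steps, the loop completes exactly 10 steps; with `None` it runs the full budget, which is the no-regression case the falsifiers check.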

noahgift enabled auto-merge (squash) May 22, 2026 14:08

Merge branch 'main' into feat/apr-distill-smoke-only-pmat-706

ded9fb6

noahgift wants to merge 2 commits into main from feat/apr-distill-smoke-only-pmat-706
