Context
Teammates need access to normalization statistics produced by the current preprocessing workflow.
These stats are currently computed in code but not persisted in a standard artifact.
Goal
Persist normalization stats to a JSON file as part of preprocessing so downstream users/scripts can reliably consume them.
Requirements
- Save the following keys in JSON:
  - audio_mean
  - audio_std
  - spec_mean
  - spec_std
- Use current preprocessing pipeline outputs (no Audio-Processing integration dependency for this issue).
- Ensure JSON is written deterministically to a documented location.
- Update README with:
  - where file is written
  - how it is generated
  - example JSON snippet
- Keep this change minimal and mergeable quickly.
Suggested Output Path
data/normalization_stats.json
(or another repo-consistent path; document final choice in README)
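A minimal sketch of the write step, assuming the preprocessing code already holds the four statistics as scalar numbers in a dict. The function name `write_normalization_stats` and the `stats` argument are hypothetical, not existing repo code. Sorted keys and fixed indentation are one way to get byte-identical output for identical inputs.

```python
import json
from pathlib import Path

# Keys required by this issue.
REQUIRED_KEYS = ("audio_mean", "audio_std", "spec_mean", "spec_std")


def write_normalization_stats(stats, path="data/normalization_stats.json"):
    """Write the four normalization stats to a JSON file and return its path.

    `stats` is assumed to be a mapping of key -> scalar number (hypothetical
    interface; adapt to whatever the preprocessing code actually exposes).
    """
    missing = [key for key in REQUIRED_KEYS if key not in stats]
    if missing:
        raise KeyError(f"missing normalization stats: {missing}")

    # Keep only the required keys and coerce to plain floats so that
    # NumPy or tensor scalars serialize cleanly.
    payload = {key: float(stats[key]) for key in REQUIRED_KEYS}

    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)

    # sort_keys + fixed indent + trailing newline: the same input always
    # yields the same bytes.
    text = json.dumps(payload, indent=2, sort_keys=True) + "\n"
    out.write_text(text, encoding="utf-8")
    return out
```

If the pipeline produces per-channel or per-bin statistics rather than scalars, the `float(...)` coercion would need to become a list conversion; that choice should be documented in the README alongside the path.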
Acceptance Criteria
- Running preprocessing produces a JSON file with all 4 required keys.
- Values are numeric and non-null when data exists.
- README documents generation + location.
- PR includes example output in description/log.
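The first two criteria could be checked with a small script along these lines. This is a sketch only: `validate_stats_file` is a hypothetical helper, and it assumes each value is a single scalar number rather than an array.

```python
import json
import math
from numbers import Real

# Keys required by this issue.
REQUIRED_KEYS = ("audio_mean", "audio_std", "spec_mean", "spec_std")


def validate_stats_file(path):
    """Return a list of problems found in the stats file (empty if it passes)."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)

    problems = []
    for key in REQUIRED_KEYS:
        if key not in data:
            problems.append(f"missing key: {key}")
            continue
        value = data[key]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            problems.append(f"{key} is not numeric: {value!r}")
        elif not math.isfinite(value):
            problems.append(f"{key} is not finite: {value!r}")
    return problems
```

Calling `validate_stats_file("data/normalization_stats.json")` after a preprocessing run and pasting the result into the PR description would also cover the "example output" criterion.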
Out of Scope
- Audio-Processing submodule integration
- Refactoring preprocessing architecture
- Broader training pipeline redesign