Save computed normalization stats (audio_mean, audio_std, spec_mean, spec_std) to JSON #15

@riley-1995

Context

Teammates need access to normalization statistics produced by the current preprocessing workflow.
These stats are currently computed in code but not persisted in a standard artifact.

Goal

Persist normalization stats to a JSON file as part of preprocessing so downstream users/scripts can reliably consume them.

Requirements

  • Save the following keys in JSON:
    • audio_mean
    • audio_std
    • spec_mean
    • spec_std
  • Use current preprocessing pipeline outputs (no Audio-Processing integration dependency for this issue).
  • Ensure JSON is written deterministically to a documented location.
  • Update README with:
    • where the file is written
    • how it is generated
    • example JSON snippet
  • Keep this change minimal so it can be merged quickly.
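
As a starting point for the requirements above, here is a minimal sketch of the write step. It assumes the preprocessing code is Python and that the four stats are already available as a dict of scalars; the function name `save_normalization_stats` and the `REQUIRED_KEYS` constant are hypothetical, not names from the repo. Sorted keys and a fixed indent are one way to read "deterministically": the same values always give a byte-identical file.

```python
import json
from pathlib import Path

# The four keys this issue requires in the output file.
REQUIRED_KEYS = ("audio_mean", "audio_std", "spec_mean", "spec_std")


def save_normalization_stats(stats, path="data/normalization_stats.json"):
    """Write the four normalization stats to JSON (hypothetical helper).

    `stats` is assumed to be a mapping from key name to a scalar value.
    The default path is the one suggested in this issue.
    """
    missing = [key for key in REQUIRED_KEYS if key not in stats]
    if missing:
        raise KeyError(f"missing normalization stats: {missing}")

    # float() turns NumPy-style scalars into plain JSON-serializable numbers.
    payload = {key: float(stats[key]) for key in REQUIRED_KEYS}

    out = Path(path)
    out.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys and a fixed indent keep the output identical across runs
    # whenever the values themselves are identical.
    text = json.dumps(payload, indent=2, sort_keys=True) + "\n"
    out.write_text(text, encoding="utf-8")
    return out
```

Calling this once at the end of preprocessing, with whatever variables currently hold the computed values, would be enough to satisfy the file-output requirement without touching the rest of the pipeline.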

Suggested Output Path

  • data/normalization_stats.json
    (or another repo-consistent path; document final choice in README)

Acceptance Criteria

  • Running preprocessing produces a JSON file with all 4 required keys.
  • Values are numeric and non-null when data exists.
  • README documents generation + location.
  • PR includes example output in description/log.
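
The first two acceptance criteria can be checked mechanically. The sketch below is one possible check, assuming Python and the suggested output path; `check_normalization_stats` is a hypothetical name, and treating NaN or infinity as a failure is an added assumption beyond "numeric and non-null".

```python
import json
import math
from pathlib import Path

REQUIRED_KEYS = ("audio_mean", "audio_std", "spec_mean", "spec_std")


def check_normalization_stats(path="data/normalization_stats.json"):
    """Return a list of problems found in the stats file (empty if none)."""
    stats = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(stats, dict):
        return ["top-level JSON value is not an object"]

    problems = []
    for key in REQUIRED_KEYS:
        value = stats.get(key)  # None covers both a missing key and JSON null
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{key}: expected a number, got {value!r}")
        elif not math.isfinite(value):
            problems.append(f"{key}: value is not finite ({value!r})")
    return problems
```

An empty return value means all four keys are present and numeric; the returned messages could also be pasted into the PR description as part of the example output.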

Out of Scope

  • Audio-Processing submodule integration
  • Refactoring preprocessing architecture
  • Broader training pipeline redesign
