WhisperKitPro: local model dir without Hugging Face defaults #98

Merged
dbrkn merged 1 commit into main from arda/whisperkitpro-local-model-pipeline on Apr 8, 2026

@arda-argmax
Contributor

Summary

Adds a whisperkitpro-local-model pipeline alias for running WhisperKitPro against a local model directory only (no default Hugging Face repo_id / model_variant), and extends WhisperKitProConfig so model_dir alone selects --model-path without downloading.

Changes

  • WhisperKitProConfig / WhisperKitPro
    • model_dir only: if set to a non-empty path, require an existing directory and use it for --model-path; no snapshot_download.
    • HF path: unchanged when model_dir is unset and both repo_id and model_variant are set (non-empty).
    • Legacy mode: unchanged; validation now raises an error if none of the three modes (local model_dir, HF pair, legacy triple) is validly configured.
    • Treat empty strings like missing values via _config_str_provided.
  • Pipeline alias whisperkitpro-local-model
    • Defaults: model_dir from WHISPERKITPRO_LOCAL_MODEL_PATH, cli_path from WHISPERKITPRO_CLI_PATH.
  • Docs / config copy
    • WhisperKitProTranscriptionConfig and WhisperKitProConfig field descriptions updated to match the three modes (local / HF / legacy).

How to run

export WHISPERKITPRO_CLI_PATH=/path/to/whisperkitpro-cli
export WHISPERKITPRO_LOCAL_MODEL_PATH=/path/to/existing/model/dir

LOCAL_DATASET_PATH=/path/to/dataset LOCAL_DATASET_SPLIT=test \
uv run openbench-cli evaluate \
  --pipeline whisperkitpro-local-model \
  --dataset local-dataset \
  --metrics wer keyword_fscore keyword_precision keyword_recall
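Before launching the command above, it can help to confirm that both environment variables are set. The following is a small sketch of how the alias defaults might be read. The function name `local_model_alias_defaults` and the returned structure are assumptions for illustration; only the two environment variable names and the `model_dir` / `cli_path` field names come from this PR.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

# Field name -> environment variable, as listed in the PR description.
ENV_BY_FIELD = {
    "model_dir": "WHISPERKITPRO_LOCAL_MODEL_PATH",
    "cli_path": "WHISPERKITPRO_CLI_PATH",
}


def local_model_alias_defaults(
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[Dict[str, Optional[str]], List[str]]:
    """Return (defaults, missing_env_vars); hypothetical helper, not the PR's code."""
    source = os.environ if env is None else env
    # Unset and empty variables both map to None.
    defaults = {field: source.get(var) or None for field, var in ENV_BY_FIELD.items()}
    missing = [ENV_BY_FIELD[field] for field, value in defaults.items() if value is None]
    return defaults, missing
```

Calling `local_model_alias_defaults()` with no argument reads from the current process environment; a non-empty `missing` list indicates which `export` line was skipped.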

Breaking / behavior note

If model_dir is set but does not exist, the run fails with FileNotFoundError; there is no fallback to Hugging Face (even when repo_id / model_variant are also set).
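A minimal sketch of this fail-fast behavior, assuming the check is a plain directory test: `resolve_local_model_path` is a hypothetical name, and the real validation in `WhisperKitProConfig` may differ in detail. The `repo_id` and `model_variant` parameters are accepted only to show that they are ignored once `model_dir` is set.

```python
from pathlib import Path
from typing import Optional


def resolve_local_model_path(
    model_dir: str,
    repo_id: Optional[str] = None,        # deliberately unused: no Hugging Face fallback
    model_variant: Optional[str] = None,  # deliberately unused: no Hugging Face fallback
) -> Path:
    """Return the directory to pass as --model-path, or raise if it is missing."""
    path = Path(model_dir).expanduser()
    if not path.is_dir():
        # Fail immediately instead of downloading from Hugging Face.
        raise FileNotFoundError(f"model_dir is not an existing directory: {path}")
    return path
```

With this shape, a typo in `WHISPERKITPRO_LOCAL_MODEL_PATH` surfaces as an immediate error rather than as a silent download of a different model.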

@arda-argmax requested review from EduardoPach and dbrkn on April 8, 2026 01:01
Collaborator

@EduardoPach left a comment

lgtm!

Contributor

@dbrkn left a comment

lgtm

@dbrkn merged commit a529f39 into main on Apr 8, 2026
2 checks passed
3 participants