Add Korean Whisper models with automatic CTranslate2 conversion by hoiyada7-maker · Pull Request #7 · parkscloud/Hearsay

hoiyada7-maker · 2026-05-30T02:19:10Z

Summary

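small-ko (SungBeom/whisper-small-ko, 244M): small Whisper model specialized for Korean
medium-ko-zeroth (seastar105/whisper-medium-ko-zeroth, 769M): medium Whisper model fine-tuned on the Zeroth corpus

When one of these models is selected, it is downloaded from HuggingFace and automatically converted to CTranslate2 int8 format. Later loads reuse the cached conversion.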
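The download-once, convert-once, reuse-afterwards flow described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the function names match those listed in the PR, but their signatures, the `cache_root` layout, the injectable `run` parameter, and the `is_converted` and `build_convert_command` helpers are assumptions. The `ct2-transformers-converter` flags used (`--model`, `--output_dir`, `--quantization`) are the converter's standard options.

```python
import subprocess
from pathlib import Path
from typing import Callable, List

# Repo ids and parameter counts are taken from the PR summary; the dict layout is assumed.
HF_CUSTOM_MODELS = {
    "small-ko": {"repo_id": "SungBeom/whisper-small-ko", "params": "244M"},
    "medium-ko-zeroth": {"repo_id": "seastar105/whisper-medium-ko-zeroth", "params": "769M"},
}


def is_hf_custom_model(name: str) -> bool:
    return name in HF_CUSTOM_MODELS


def get_hf_model_local_path(name: str, cache_root: Path) -> Path:
    # Assumed layout: one directory per model alias under the cache root.
    return cache_root / name


def is_converted(model_dir: Path) -> bool:
    # Hypothetical helper: a finished CTranslate2 conversion writes model.bin.
    return (model_dir / "model.bin").is_file()


def build_convert_command(name: str, output_dir: Path,
                          converter: str = "ct2-transformers-converter") -> List[str]:
    # Hypothetical helper: the converter fetches the HF repo itself when given a repo id.
    return [
        converter,
        "--model", HF_CUSTOM_MODELS[name]["repo_id"],
        "--output_dir", str(output_dir),
        "--quantization", "int8",
    ]


def resolve_model_path(name: str, cache_root: Path,
                       run: Callable = subprocess.run) -> str:
    """Return a value the engine can load: a built-in model name or a local path."""
    if not is_hf_custom_model(name):
        return name  # built-in model names pass through unchanged
    model_dir = get_hf_model_local_path(name, cache_root)
    if not is_converted(model_dir):
        run(build_convert_command(name, model_dir), check=True)
    return str(model_dir)
```

Passing `run` as a parameter is only a convenience for testing the cache logic without invoking the real converter.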

Changes

constants.py: add HF_CUSTOM_MODELS (repo_id, parameter count, VRAM info)
model_manager.py: implement the HF model download + CTranslate2 conversion pipeline
- is_hf_custom_model(), resolve_model_path(), get_hf_model_local_path()
- Automatic discovery of ct2-transformers-converter (including the pip --user path)
engine.py: load() now calls resolve_model_path() → supports local CTranslate2 paths
settings_window.py: add a download UI for model changes
- Show a hint when a custom model is selected ("Download required" / "Conversion complete")
- On Save, if the model is not yet converted, show a progress bar and convert in the background; save and close automatically on completion
requirements.txt: add transformers>=4.23.0
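One plausible way to locate `ct2-transformers-converter` when it was installed with `pip install --user`, so that its scripts directory may not be on `PATH`, is sketched below. The helper names `user_scripts_dir` and `find_converter` and the `extra_dirs` parameter are hypothetical, and the PR's actual discovery logic may differ. Only standard-library calls are used.

```python
import os
import shutil
import sysconfig
from pathlib import Path
from typing import Iterable, Optional

CONVERTER = "ct2-transformers-converter"


def user_scripts_dir() -> Path:
    """Directory where `pip install --user` places console scripts.

    Uses the "posix_user" / "nt_user" sysconfig schemes; macOS framework
    builds use a different user scheme, which this sketch does not cover.
    """
    scheme = "nt_user" if os.name == "nt" else "posix_user"
    return Path(sysconfig.get_path("scripts", scheme))


def find_converter(extra_dirs: Optional[Iterable] = None) -> Optional[str]:
    """Return the full path of the converter executable, or None if not found."""
    # 1. Normal PATH lookup.
    found = shutil.which(CONVERTER)
    if found:
        return found
    # 2. Fall back to the per-user scripts directory (or caller-supplied dirs).
    dirs = list(extra_dirs) if extra_dirs is not None else [user_scripts_dir()]
    search_path = os.pathsep.join(str(d) for d in dirs)
    return shutil.which(CONVERTER, path=search_path)
```

`shutil.which` also handles the `.exe` suffix on Windows via `PATHEXT`, so no per-platform file name is needed.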

Test Results

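✅ small-ko and medium-ko-zeroth appear in the model list
✅ small-ko download → CTranslate2 conversion completed (model.bin generated)
✅ Already-converted models are skipped without re-downloading
✅ engine.load() successfully loads the local CTranslate2 model
✅ Inference confirmed to work correctly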

🤖 Generated with Claude Code

- Add HF_CUSTOM_MODELS dict with 2 Korean models:
  * SungBeom/whisper-small-ko (small-ko)
  * seastar105/whisper-medium-ko-zeroth (medium-ko-zeroth)
- Implement automatic CTranslate2 int8 conversion on first use
- Add model_manager functions: is_hf_custom_model, resolve_model_path, download_and_convert
- Update engine.load() to use resolve_model_path for local CTranslate2 models
- Enhance SettingsWindow with model download progress UI and status hints
- Skip re-download if model already converted (caching)
- Fix ct2-transformers-converter discovery for pip --user installs
- Add transformers>=4.23.0 dependency

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

Replace fixed 30s chunks with adaptive silence-based cuts:
- Add _ChunkAccumulator: buffers audio, cuts when >=5s is buffered and 1s of trailing silence is detected, or unconditionally at the 30s hard cap (Whisper context window)
- Each chunk carries an absolute start_time so timestamps remain accurate across variable-length chunks; this eliminates the chunk_index*30 drift from overlap
- Both-mode silence detection combined across loopback+mic (cut only when both are quiet)
- Pipeline and MarkdownWriter updated to consume (chunk_index, start_time, audio) tuples
- First transcription text now appears in ~10s vs ~42s with the old fixed 30s chunks

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
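The cutting rule in this commit message (cut once at least 5s is buffered and the last 1s is silent, or unconditionally at 30s) could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's `_ChunkAccumulator`: the class and method names, the RMS-based silence test, and the `silence_rms` threshold are all hypothetical, and the combined loopback+mic detection is left out. Silence is only checked at the end of each `feed` call, so it assumes audio arrives in short blocks.

```python
import math
from typing import List, Sequence, Tuple

Chunk = Tuple[int, float, List[float]]  # (chunk_index, start_time, audio)


class ChunkAccumulator:
    def __init__(self, sample_rate: int = 16000, min_seconds: float = 5.0,
                 silence_seconds: float = 1.0, max_seconds: float = 30.0,
                 silence_rms: float = 0.01):  # threshold is an assumed value
        self.sample_rate = sample_rate
        self.min_samples = int(min_seconds * sample_rate)
        self.silence_samples = int(silence_seconds * sample_rate)
        self.max_samples = int(max_seconds * sample_rate)
        self.silence_rms = silence_rms
        self._buffer: List[float] = []
        self._consumed = 0  # samples already emitted in earlier chunks
        self._index = 0

    def _tail_is_silent(self) -> bool:
        # RMS energy of the trailing window, compared against the threshold.
        tail = self._buffer[-self.silence_samples:]
        rms = math.sqrt(sum(s * s for s in tail) / len(tail))
        return rms < self.silence_rms

    def _emit(self, n: int) -> Chunk:
        audio, self._buffer = self._buffer[:n], self._buffer[n:]
        # Absolute start time comes from the running sample count,
        # not from chunk_index * 30, so variable lengths do not drift.
        chunk = (self._index, self._consumed / self.sample_rate, audio)
        self._consumed += n
        self._index += 1
        return chunk

    def feed(self, samples: Sequence[float]) -> List[Chunk]:
        """Append samples and return any chunks that are ready to transcribe."""
        self._buffer.extend(samples)
        chunks: List[Chunk] = []
        # Hard cap: never let a chunk exceed max_seconds.
        while len(self._buffer) >= self.max_samples:
            chunks.append(self._emit(self.max_samples))
        # Adaptive cut: enough audio buffered and the tail is quiet.
        if len(self._buffer) >= self.min_samples and self._tail_is_silent():
            chunks.append(self._emit(len(self._buffer)))
        return chunks
```

Because each tuple carries its own `start_time`, a consumer such as the writer can stamp output without knowing how long earlier chunks were.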

hoiyada7-maker and others added 2 commits May 30, 2026 11:07

hoiyada7-maker wants to merge 2 commits into parkscloud:master from hoiyada7-maker:feature/korean
