feat: GPU support, hotkey, beep notifications, and clipboard copy by hoiyada7-maker · Pull Request #8 · parkscloud/Hearsay

hoiyada7-maker · 2026-06-01T11:27:49Z

Summary

- GPU detection: Switch from torch to ctranslate2; auto-register NVIDIA pip-package DLL dirs on startup so CUDA Toolkit is not required
- CUDA error handling: Dialog when GPU runtime DLLs are missing, offering 'Switch to CPU' or 'Install CUDA Toolkit' instead of a silent crash
- HuggingFace models: Custom HF Whisper model support with CTranslate2 conversion; add transformers/torch to requirements
- VAD chunking: Variable-length audio chunking with VAD silence detection (cherry-picked from feature/korean)
- Hotkey: Global Ctrl+Alt+R to toggle recording, configurable from Settings with live key capture
- Beep notifications: Optional beeps for recording start, stop, and transcript save
- Clipboard: Optional copy of transcript body to clipboard on save
- UI: Translate all Korean strings to English; fix `exc` scope NameError in download error handler (Python 3.12+)

Test plan

- GPU inference works without CUDA Toolkit installed (pip packages only)
- CUDA error dialog appears and Switch to CPU restarts recording
- Korean HF model downloads, converts, and transcribes correctly
- Ctrl+Alt+R starts/stops recording; hotkey rebindable from Settings
- Beeps play at correct moments and can be individually disabled
- Clipboard contains transcript body after save when option is enabled
- No Korean text visible anywhere in the UI

faster-whisper uses ctranslate2 as its inference backend, not PyTorch. The previous detection relied on `import torch`, which was never listed as a dependency, so GPU detection silently fell back to CPU for all users regardless of their hardware. Switch to `ctranslate2.get_cuda_device_count()` so detection reflects the same CUDA stack that actually runs inference. torch is still used opportunistically for GPU name and VRAM info when available, with a name-based VRAM lookup table as a fallback.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
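The detection change described above could look roughly like the sketch below. Only `ctranslate2.get_cuda_device_count()` comes from the commit message; the function names `cuda_device_count` and `pick_device` and the injectable `probe` argument are illustrative assumptions, not the PR's actual code.

```python
from typing import Callable, Optional


def cuda_device_count(probe: Optional[Callable[[], int]] = None) -> int:
    """Return the number of CUDA devices visible to ctranslate2, or 0.

    `probe` is a hypothetical test hook; when omitted, the real
    ctranslate2.get_cuda_device_count() is used.
    """
    if probe is None:
        try:
            import ctranslate2  # the backend faster-whisper runs on
        except (ImportError, OSError):
            # Package missing, or its native libraries failed to load.
            return 0
        probe = ctranslate2.get_cuda_device_count
    try:
        return max(0, int(probe()))
    except Exception:
        # Any failure while querying the driver is treated as "no GPU".
        return 0


def pick_device(probe: Optional[Callable[[], int]] = None) -> str:
    """Choose "cuda" when at least one device is reported, else "cpu"."""
    return "cuda" if cuda_device_count(probe) > 0 else "cpu"
```

Because the probe is the same library that performs inference, a missing torch install can no longer mask a usable GPU.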

When torch is absent, fall back to nvidia-smi for the actual GPU name and VRAM (MiB → GB), so the UI shows the real device name instead of the generic "CUDA Device 0" label.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
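A minimal sketch of the nvidia-smi fallback, assuming the tool is queried with its CSV output options. The helper names `parse_nvidia_smi` and `query_gpus` are hypothetical, and the PR's real implementation may query or round differently.

```python
import subprocess
from typing import List, Tuple


def parse_nvidia_smi(csv_text: str) -> List[Tuple[str, float]]:
    """Parse 'name, memory.total' rows (MiB, no units) into (name, GB) pairs."""
    gpus = []
    for line in csv_text.splitlines():
        # Split on the last comma so a comma inside the name cannot break parsing.
        name, _, mib = line.rpartition(",")
        try:
            gb = round(int(mib.strip()) / 1024, 1)  # MiB -> GB (binary units)
        except ValueError:
            continue  # blank row, or a non-numeric value such as "[N/A]"
        gpus.append((name.strip(), gb))
    return gpus


def query_gpus() -> List[Tuple[str, float]]:
    """Run nvidia-smi; return [] when it is missing, slow, or fails."""
    cmd = ["nvidia-smi", "--query-gpu=name,memory.total",
           "--format=csv,noheader,nounits"]
    try:
        out = subprocess.run(cmd, capture_output=True, text=True,
                             timeout=5, check=True).stdout
    except (OSError, subprocess.SubprocessError):
        return []
    return parse_nvidia_smi(out)
```

Keeping the parser separate from the subprocess call makes the MiB-to-GB conversion checkable without a GPU.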

When the CUDA driver is present but CUDA Toolkit 12.x is not installed (cublas64_12.dll etc. missing), ctranslate2 and faster-whisper crash at runtime rather than at device detection time.

- gpu_detect: probe the CUDA runtime with a tiny StorageView allocation before reporting cuda_available=True; warn and fall back to CPU when runtime DLLs are absent.
- engine: catch RuntimeError on load() for "cannot be loaded" and automatically retry with device=cpu / compute_type=int8 so the app stays functional without a hard crash.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
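The engine-side retry described in this commit can be sketched as follows. The wrapper name `load_with_cpu_fallback` and the `load` callable are assumptions for illustration; only the "cannot be loaded" match and the device=cpu / compute_type=int8 retry come from the commit message.

```python
from typing import Any, Callable


def load_with_cpu_fallback(load: Callable[..., Any],
                           device: str,
                           compute_type: str) -> Any:
    """Call load(device=..., compute_type=...); retry on CPU if CUDA libs are missing."""
    try:
        return load(device=device, compute_type=compute_type)
    except RuntimeError as exc:
        # Only swallow the specific "library cannot be loaded" failure,
        # and only when a GPU was requested; re-raise everything else.
        if device != "cuda" or "cannot be loaded" not in str(exc):
            raise
        return load(device="cpu", compute_type="int8")
```

In an application, `load` might be something like `functools.partial(WhisperModel, "small")` from faster-whisper, whose constructor accepts `device` and `compute_type` keyword arguments.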

When device=cuda is configured but CUDA runtime DLLs are missing, engine.load() now raises CudaUnavailableError instead of silently switching to CPU. app.py catches this on the background loader thread and posts a dialog to the main thread offering two actions:

- "Switch to CPU" (labeled "CPU로 변경" at this commit): updates and saves config, restarts recording on CPU
- "Install CUDA Toolkit" (labeled "CUDA Toolkit 설치" at this commit): opens the NVIDIA download page in the browser

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
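A rough sketch of the flow this commit describes: a dedicated exception raised by the loader, and a handler for the two dialog actions. `CudaUnavailableError` is named in the commit; `load_engine`, `handle_choice`, the choice strings, the callback parameters, and the download URL constant are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict

# Assumed target for the "Install CUDA Toolkit" button; the PR does not state the URL.
CUDA_DOWNLOAD_URL = "https://developer.nvidia.com/cuda-downloads"


class CudaUnavailableError(RuntimeError):
    """device=cuda was requested but the CUDA runtime libraries cannot load."""


def load_engine(load: Callable[[str], Any], device: str) -> Any:
    """Load the engine, converting the CUDA library failure into a typed error."""
    try:
        return load(device)
    except RuntimeError as exc:
        if device == "cuda" and "cannot be loaded" in str(exc):
            raise CudaUnavailableError(str(exc)) from exc
        raise


def handle_choice(choice: str,
                  config: Dict[str, Any],
                  save_config: Callable[[Dict[str, Any]], None],
                  restart_recording: Callable[[], None],
                  open_url: Callable[[str], Any]) -> None:
    """Apply the user's dialog choice (hypothetical choice identifiers)."""
    if choice == "switch_to_cpu":
        config["device"] = "cpu"
        save_config(config)
        restart_recording()
    elif choice == "install_toolkit":
        open_url(CUDA_DOWNLOAD_URL)
```

Passing `webbrowser.open` as `open_url` would give the browser behavior described above, while injected callbacks keep the handler testable without a UI.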

Users who install nvidia-cublas-cu12 / nvidia-cuda-runtime-cu12 via pip no longer need the full CUDA Toolkit. On startup, cuda_dlls.py scans all site-packages roots for nvidia/*/bin directories and registers each one with os.add_dll_directory() before ctranslate2 is imported. This works regardless of the Python install path (user site-packages, venv, or system), so cublas64_12.dll and friends are always discoverable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
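The startup scan could be sketched along these lines. The function names are hypothetical and the real cuda_dlls.py may collect its roots differently; `os.add_dll_directory()` exists only on Windows, so the sketch degrades to a no-op elsewhere.

```python
import os
import site
from pathlib import Path
from typing import Iterable, List


def default_site_roots() -> List[str]:
    """Collect system/venv and per-user site-packages directories."""
    roots: List[str] = []
    getsitepackages = getattr(site, "getsitepackages", None)
    if getsitepackages is not None:  # absent in some older virtualenv setups
        roots.extend(getsitepackages())
    roots.append(site.getusersitepackages())
    return roots


def find_nvidia_bin_dirs(roots: Iterable[str]) -> List[Path]:
    """Return every existing <root>/nvidia/<package>/bin directory."""
    found: List[Path] = []
    for root in dict.fromkeys(str(r) for r in roots):  # de-duplicate, keep order
        base = Path(root)
        if base.is_dir():
            found.extend(sorted(p for p in base.glob("nvidia/*/bin") if p.is_dir()))
    return found


def register_dll_dirs(dirs: Iterable[Path]) -> list:
    """Register directories with the Windows DLL search path.

    Returns the handles so the caller can keep them alive; returns []
    on platforms without os.add_dll_directory.
    """
    add = getattr(os, "add_dll_directory", None)
    if add is None:
        return []
    return [add(str(d)) for d in dirs]
```

A caller would run `register_dll_dirs(find_nvidia_bin_dirs(default_site_roots()))` before importing ctranslate2.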

…ments

Allows pip install -r requirements.txt to pull CUDA runtime DLLs automatically, enabling GPU inference without a full CUDA Toolkit install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… them

os.add_dll_directory() covers Python extension module loading but not ctranslate2's internal ctypes.CDLL("cublas64_12") calls, which only search PATH on Windows. Now both mechanisms are set so cublas64_12.dll is found at inference time as well as at import time.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
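The PATH half of this fix might look like the sketch below. The helper name `prepend_to_path` and the injectable `env` mapping are assumptions; the comparison is a plain string match and does not account for Windows path case-insensitivity.

```python
import os
from typing import Iterable, List, MutableMapping, Optional


def prepend_to_path(dirs: Iterable[str],
                    env: Optional[MutableMapping[str, str]] = None) -> List[str]:
    """Put `dirs` at the front of PATH, skipping entries already present.

    Returns the directories that were actually added.
    """
    env = os.environ if env is None else env
    current = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    # dict.fromkeys de-duplicates the new entries while keeping their order.
    new = [d for d in dict.fromkeys(str(d) for d in dirs) if d not in current]
    env["PATH"] = os.pathsep.join(new + current)
    return new
```

Calling this with the same directories that were passed to `os.add_dll_directory()` covers both lookup mechanisms named in the commit.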

- Add HF_CUSTOM_MODELS dict with 2 Korean models:
  * SungBeom/whisper-small-ko (small-ko)
  * seastar105/whisper-medium-ko-zeroth (medium-ko-zeroth)
- Implement automatic CTranslate2 int8 conversion on first use
- Add model_manager functions: is_hf_custom_model, resolve_model_path, download_and_convert
- Update engine.load() to use resolve_model_path for local CTranslate2 models
- Enhance SettingsWindow with model download progress UI and status hints
- Skip re-download if model already converted (caching)
- Fix ct2-transformers-converter discovery for pip --user installs
- Add transformers>=4.23.0 dependency

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
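The model-manager pieces listed above could fit together roughly as follows. The function names `is_hf_custom_model` and `resolve_model_path` and the two model IDs come from the commit message; their signatures, the dict layout, the cache directory naming, and the helpers `converted_dir`, `is_converted`, and `conversion_command` are assumptions. The command only builds the `ct2-transformers-converter` argument list and does not run it.

```python
from pathlib import Path
from typing import Dict, List

# Assumed layout: short name -> Hugging Face repo ID.
HF_CUSTOM_MODELS: Dict[str, str] = {
    "small-ko": "SungBeom/whisper-small-ko",
    "medium-ko-zeroth": "seastar105/whisper-medium-ko-zeroth",
}


def is_hf_custom_model(name: str) -> bool:
    return name in HF_CUSTOM_MODELS


def converted_dir(name: str, cache_root: str) -> Path:
    """Hypothetical cache location for the converted CTranslate2 model."""
    return Path(cache_root) / f"{name}-ct2-int8"


def is_converted(name: str, cache_root: str) -> bool:
    # A converted CTranslate2 model directory contains model.bin.
    return (converted_dir(name, cache_root) / "model.bin").is_file()


def resolve_model_path(name: str, cache_root: str) -> str:
    """Built-in model names pass through; custom ones map to the local directory."""
    if not is_hf_custom_model(name):
        return name
    return str(converted_dir(name, cache_root))


def conversion_command(name: str, cache_root: str) -> List[str]:
    """Argument list for a one-time int8 conversion of a custom model."""
    return [
        "ct2-transformers-converter",
        "--model", HF_CUSTOM_MODELS[name],
        "--output_dir", str(converted_dir(name, cache_root)),
        "--quantization", "int8",
    ]
```

Checking `is_converted()` before running the command gives the "skip re-download if already converted" behavior.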

Replace fixed 30s chunks with adaptive silence-based cuts:

- Add _ChunkAccumulator: buffers audio, cuts when >=5s buffered and 1s trailing silence detected, or unconditionally at 30s hard cap (Whisper context window)
- Each chunk carries absolute start_time so timestamps remain accurate across variable-length chunks; this eliminates chunk_index*30 drift from overlap
- Both-mode silence detection combined across loopback+mic (cut only when both quiet)
- Pipeline and MarkdownWriter updated to consume (chunk_index, start_time, audio) tuples
- First transcription text now appears in ~10s vs ~42s with the old fixed 30s chunks

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
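The cutting rule above can be illustrated with a simplified accumulator. This is a sketch, not the PR's `_ChunkAccumulator`: it takes a precomputed `is_silent` flag per block instead of running VAD, stores samples in a plain list, and lets the final block overshoot the 30 s cap slightly rather than splitting it.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Chunk = Tuple[int, float, List[float]]  # (chunk_index, start_time, audio)


@dataclass
class ChunkAccumulator:
    sample_rate: int = 16000
    min_seconds: float = 5.0       # do not cut before this much audio
    silence_seconds: float = 1.0   # trailing silence needed for a cut
    max_seconds: float = 30.0      # hard cap (Whisper context window)
    _buf: List[float] = field(default_factory=list, init=False)
    _silent_run: int = field(default=0, init=False)  # trailing silent samples
    _emitted: int = field(default=0, init=False)     # samples already cut
    _index: int = field(default=0, init=False)

    def add(self, block: List[float], is_silent: bool) -> Optional[Chunk]:
        """Buffer one block; return a chunk when a cut condition is met."""
        self._buf.extend(block)
        self._silent_run = self._silent_run + len(block) if is_silent else 0
        n = len(self._buf)
        hit_cap = n >= self.max_seconds * self.sample_rate
        quiet_cut = (n >= self.min_seconds * self.sample_rate
                     and self._silent_run >= self.silence_seconds * self.sample_rate)
        if not (hit_cap or quiet_cut):
            return None
        # start_time is absolute: total samples emitted before this chunk.
        chunk = (self._index, self._emitted / self.sample_rate, self._buf)
        self._emitted += n
        self._index += 1
        self._buf, self._silent_run = [], 0
        return chunk
```

For the combined loopback+mic mode, a caller could pass `is_silent = loopback_silent and mic_silent`, so that a cut happens only when both sources are quiet. Deriving `start_time` from the running sample count is what keeps timestamps correct when chunk lengths vary.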

Settings window additions:

- Hotkey: configurable global hotkey (default ctrl+alt+r) with live capture: click Capture, then press any modifier+key combo
- Beep notifications: three independent checkboxes for recording start, stop, and MD file save completion (winsound.Beep)
- Clipboard: optional checkbox to copy full transcript body (no timestamps, no header/footer) to clipboard after MD save

App wiring:

- keyboard.add_hotkey registers/re-registers on startup, wizard complete, and settings save; unregistered on quit
- Hotkey callback dispatches to main thread via safe_after so tkinter state is never touched from the keyboard thread
- Beeps run in daemon threads to avoid blocking recording teardown
- Clipboard extraction reads the finalized MD body between header and --- footer marker; written to tk clipboard on main thread

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
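The "dispatch to main thread" wiring can be illustrated with a small queue-based dispatcher. This is a generic sketch of the pattern, not the app's `safe_after` helper, whose implementation is not shown in the PR; the class name `MainThreadDispatcher` is hypothetical.

```python
import queue
from typing import Any, Callable


class MainThreadDispatcher:
    """Collect callbacks from worker threads and run them on the main thread."""

    def __init__(self) -> None:
        self._pending: "queue.SimpleQueue" = queue.SimpleQueue()

    def post(self, fn: Callable[..., Any], *args: Any) -> None:
        """Safe to call from any thread (e.g. a hotkey listener thread)."""
        self._pending.put((fn, args))

    def drain(self) -> int:
        """Run all queued callbacks; call this only from the main thread."""
        ran = 0
        while True:
            try:
                fn, args = self._pending.get_nowait()
            except queue.Empty:
                return ran
            fn(*args)
            ran += 1
```

In a tkinter app, the main thread could call `drain()` from a function that re-arms itself with `root.after(...)`, while the hotkey is registered as something like `keyboard.add_hotkey("ctrl+alt+r", lambda: dispatcher.post(toggle_recording))`, where `toggle_recording` stands in for the app's own handler. The keyboard thread then only enqueues work and never touches widget state.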

ct2-transformers-converter needs torch to load and convert HuggingFace Whisper models. torch is a one-time conversion dependency only; GPU inference continues to use ctranslate2 directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…dd transformers dep

- settings_window: translate beep/clipboard checkbox labels to English
- app: translate CUDA error dialog title, body, and buttons to English
- settings_window: fix NameError in _download_bg by capturing str(exc) into a lambda default arg before it goes out of scope (Python 3.12+ behavior)
- requirements.txt: transformers was already listed; verified present

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
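The NameError fix can be reproduced in isolation. The sketch below is not the `_download_bg` code; `schedule_broken` and `schedule_fixed` are hypothetical stand-ins that queue a callback the way a UI `after()` call would. Python unbinds the `as` target when the `except` block ends, so a lambda that still refers to `exc` fails once it runs later. The language reference describes this unbinding for Python 3 in general, so the "3.12+" in the commit message presumably reflects where the failure was observed.

```python
from typing import Callable, List


def schedule_broken(callbacks: List[Callable[[], str]]) -> None:
    try:
        raise OSError("download failed")
    except OSError as exc:
        # The lambda closes over the *name* exc, which is deleted
        # as soon as this except block finishes.
        callbacks.append(lambda: f"Error: {exc}")


def schedule_fixed(callbacks: List[Callable[[], str]]) -> None:
    try:
        raise OSError("download failed")
    except OSError as exc:
        # Evaluate str(exc) now and bind it as a default argument,
        # so the callback no longer depends on exc.
        callbacks.append(lambda msg=str(exc): f"Error: {msg}")
```

Running the broken callback after the handler has returned raises NameError, while the fixed one returns the captured message.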

hoiyada7-maker and others added 12 commits June 1, 2026 12:57

chore: add nvidia-cublas-cu12 and nvidia-cuda-runtime-cu12 to require…

4134d22


hoiyada7-maker wants to merge 12 commits into parkscloud:master from hoiyada7-maker:feature/gpu-detection-fix
