
[BCI] QVAC-17071 feat: add BCI neural signal support (variable conv1 kernel + windowed attention)#10

Open
sharmaraju352 wants to merge 7 commits into master from feat/bci-patches-v184

Conversation


@sharmaraju352 commented Apr 16, 2026

Summary

Adds two changes to whisper.cpp to support brain-computer interface (BCI) neural signal transcription. Based on v1.8.4.1.

1. Variable conv1 kernel size

  • Reads n_audio_conv1_kernel from model hparams (defaults to 3 for standard whisper models)
  • Allows BCI models to use a different first convolution kernel size

2. Windowed self-attention for encoder layers

  • Adds n_audio_window_size and n_audio_last_window_layer hparams
  • When present, encoder self-attention is restricted to a local window for layers up to last_window_layer
  • When windowed attention is active, the encoder bypasses flash attention and uses the standard softmax path (Metal flash attention does not support custom F32 masks)
  • Flash attention remains enabled for non-BCI models and for the decoder
  • Adds proper SOS token (language + transcribe) initialization for BCI models

Backward compatibility

Both changes are backward-compatible:

  • n_audio_conv1_kernel defaults to 3 (standard whisper behavior)
  • n_audio_window_size defaults to 0 and n_audio_last_window_layer defaults to -1, which disables windowed attention entirely
  • Standard whisper models are completely unaffected

Context

Required by the new @qvac/bci-whispercpp addon: tetherto/qvac#1583

Test plan

  • Standard whisper transcription still works (no regression)
  • BCI model loads and transcribes neural signals correctly on v1.8.4.1
  • Verified locally: 10.4% average WER across 5 BCI test samples (identical to v1.7.6 baseline)

Test results

BCI package (@qvac/bci-whispercpp)

  • 4/4 integration tests pass (9/9 assertions)
  • 10.4% average WER across 5 neural signal samples — identical output to v1.7.6 baseline

Standard whisper package (@qvac/transcription-whispercpp)

  • Integration: 10/10 tests pass (40/40 assertions)
  • Unit: 13/13 tests pass (57/57 assertions)
  • No regression in standard audio transcription (s16le + f32le formats verified)

@sharmaraju352 requested review from a team as code owners April 16, 2026 10:22
Raju added 5 commits April 16, 2026 19:57
Read n_audio_conv1_kernel from model hparams to allow BCI models
to use a non-standard first convolution kernel size. Standard
whisper models default to kernel size 3.

Made-with: Cursor
- Add n_audio_window_size and n_audio_last_window_layer hparams
- When present, encoder self-attention is restricted to a local window
  for layers up to last_window_layer
- Bypass flash attention when windowed mask is active (Metal FA does
  not support custom F32 masks); flash attention remains enabled for
  non-BCI models and for the decoder
- Populate window_mask data on the encoder graph (not the cross graph)
- Add proper SOS token (language + transcribe) initialization for BCI
  models

Backward-compatible: n_audio_window_size defaults to 0 and
n_audio_last_window_layer defaults to -1, disabling windowed
attention entirely for standard whisper models.

Made-with: Cursor
Made-with: Cursor
Made-with: Cursor
@sharmaraju352 force-pushed the feat/bci-patches-v184 branch from 5326bf7 to bbb3535 on April 16, 2026 14:28
Raju added 2 commits April 18, 2026 10:46
Address review feedback:

1. Guard read_safe for BCI-specific hparams (n_audio_conv1_kernel,
   n_audio_window_size, n_audio_last_window_layer) behind a
   n_mels > 256 check. Standard whisper models have n_mels <= 128
   and do not contain these fields — reading them unconditionally
   would corrupt the file position and break model loading.

2. Add explicit is_bci flag to hparams struct, set when BCI fields
   are detected during loading.

3. Use is_bci flag (instead of n_audio_window_size > 0) to guard
   the BCI-specific decoder SOS token initialization.

4. Log BCI-specific hparams when a BCI model is detected.

Made-with: Cursor
The windowed attention mask values depend only on n_ctx and
window_size, both fixed after model load. Move the O(n_ctx^2)
computation from whisper_encode_internal (called every encode)
to whisper_init_state (called once). The encode path now just
copies the precomputed data to the graph tensor.

Made-with: Cursor