@joshmux joshmux commented Dec 8, 2025

This PR enables support for Voice Activity Detection (VAD) in the Go bindings for whisper.cpp. It adds parameters and setter methods in the Go API so that users can enable and configure VAD when using whisper through Go.

Specifically:

- Introduces new `Params` fields for VAD configuration (e.g. `vad`, `vad_model_path`, `vad_threshold`, `vad_min_speech_ms`, `vad_min_silence_ms`, `vad_max_speech_sec`, `vad_speech_pad_ms`, `vad_samples_overlap`).
- Exposes corresponding `SetVAD`, `SetVADModelPath`, `SetVADThreshold`, etc. methods on the high-level Go `Context` interface for easy use in Go code.
- Maintains full backward compatibility: existing Go code that does not call these new methods should continue working as before, without any change.

With this change, Go users of whisper.cpp can use VAD to automatically skip silent and non-speech regions, which can reduce processing time and improve the quality of segment boundaries (for example, by avoiding segmentation within silence).
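As a rough illustration of how calling code might use the new setters, here is a minimal, self-contained sketch. It does not import the real bindings: `vadConfigurer` is a hypothetical local interface that mirrors only the three setter names mentioned above, and the parameter types (`bool`, `string`, `float32`) are assumptions, as are `VADOptions`, `applyVAD`, `recordingContext`, and the model path. The check for an empty model path is likewise an assumption about sensible usage, not documented behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// vadConfigurer mirrors the three setters named in this PR.
// ASSUMPTION: parameter types are guesses; check the real Context interface.
type vadConfigurer interface {
	SetVAD(enabled bool)
	SetVADModelPath(path string)
	SetVADThreshold(threshold float32)
}

// VADOptions is a hypothetical helper struct, not part of the bindings.
type VADOptions struct {
	Enabled   bool
	ModelPath string
	Threshold float32
}

// applyVAD calls the setters only when VAD is requested, so callers that
// leave VADOptions at its zero value never touch the new methods.
func applyVAD(ctx vadConfigurer, opts VADOptions) error {
	if !opts.Enabled {
		return nil
	}
	if opts.ModelPath == "" {
		// ASSUMPTION: enabling VAD without a VAD model is treated as an error here.
		return errors.New("VAD enabled but no VAD model path given")
	}
	ctx.SetVAD(true)
	ctx.SetVADModelPath(opts.ModelPath)
	ctx.SetVADThreshold(opts.Threshold)
	return nil
}

// recordingContext is a stand-in for a real whisper context; it only
// records what was set so the sketch runs without cgo or model files.
type recordingContext struct {
	enabled   bool
	modelPath string
	threshold float32
	calls     int
}

func (r *recordingContext) SetVAD(enabled bool)         { r.enabled = enabled; r.calls++ }
func (r *recordingContext) SetVADModelPath(path string) { r.modelPath = path; r.calls++ }
func (r *recordingContext) SetVADThreshold(t float32)   { r.threshold = t; r.calls++ }

func main() {
	ctx := &recordingContext{}
	opts := VADOptions{Enabled: true, ModelPath: "path/to/vad-model.bin", Threshold: 0.5}
	if err := applyVAD(ctx, opts); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("enabled=%v model=%q threshold=%.2f\n", ctx.enabled, ctx.modelPath, ctx.threshold)
}
```

With the real bindings, the same pattern would presumably apply to a context obtained from the model, with the setters called before processing audio. Keeping the VAD calls behind a single opt-in helper matches the backward-compatibility point: code that never enables VAD never calls the new methods.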

@ggerganov ggerganov merged commit 9f5ed26 into ggml-org:master Dec 10, 2025
65 of 67 checks passed