This PR enables support for Voice Activity Detection (VAD) in the Go bindings for whisper.cpp. It adds parameters and setter methods in the Go API so that users can enable and configure VAD when using whisper through Go.
Specifically:
Introduces new Params fields for VAD configuration (e.g. vad, vad_model_path, vad_threshold, vad_min_speech_ms, vad_min_silence_ms, vad_max_speech_sec, vad_speech_pad_ms, vad_samples_overlap).
Exposes corresponding SetVAD, SetVADModelPath, SetVADThreshold, etc., methods on the high-level Go Context interface for easy use in Go code.
Maintains full backward compatibility: existing Go code that does not call these new methods should continue working as before, without any change.
With this change, Go users of whisper.cpp can use VAD to automatically skip silence and other non-speech regions, which can reduce processing time and improve the quality of segment boundaries (for example, by avoiding segmentation within silence).
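A rough sketch of how calling code might opt in to VAD through the new setters. It does not import the bindings: the `vadSetter` interface only mirrors the three method names listed above, and the parameter types (`bool`, `string`, `float32`) are assumptions for illustration. `VADOptions`, `configureVAD`, `recorder`, and the file name `vad-model.bin` are hypothetical names used only for this example.

```go
package main

import "fmt"

// vadSetter mirrors the three setters named in this description.
// The parameter types are assumed here; check the bindings for the real signatures.
type vadSetter interface {
	SetVAD(enabled bool)
	SetVADModelPath(path string)
	SetVADThreshold(threshold float32)
}

// VADOptions is a hypothetical helper struct, not part of the bindings.
type VADOptions struct {
	ModelPath string
	Threshold float32
}

// configureVAD turns VAD on only when a model path is supplied.
// With an empty path it calls no setter, so the context behaves as it
// did before these methods existed. It reports whether VAD was enabled.
func configureVAD(ctx vadSetter, opts VADOptions) bool {
	if opts.ModelPath == "" {
		return false
	}
	ctx.SetVAD(true)
	ctx.SetVADModelPath(opts.ModelPath)
	ctx.SetVADThreshold(opts.Threshold)
	return true
}

// recorder is a stand-in for a real context; it only stores what was set.
type recorder struct {
	enabled   bool
	modelPath string
	threshold float32
}

func (r *recorder) SetVAD(enabled bool)               { r.enabled = enabled }
func (r *recorder) SetVADModelPath(path string)       { r.modelPath = path }
func (r *recorder) SetVADThreshold(threshold float32) { r.threshold = threshold }

func main() {
	rec := &recorder{}
	applied := configureVAD(rec, VADOptions{ModelPath: "vad-model.bin", Threshold: 0.5})
	fmt.Println(applied, rec.enabled, rec.modelPath, rec.threshold)
}
```

In real code, the context obtained from the bindings would take the place of `recorder`, provided its setters match the assumed signatures. Code that never calls `configureVAD` is unaffected, which is the backward-compatibility property described above.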