Summary
AudioRecorder::stop_recording() claims to return 16kHz mono f32 samples, but it never actually resamples the audio. On macOS (especially Apple Silicon), the mic usually records at 44.1kHz or 48kHz. These raw samples are returned directly, while Whisper strictly expects 16kHz input.
Root Cause
In start_recording(), the device sample rate is read from cpal but never stored in the struct. In stop_recording(), there is only a TODO comment about resampling, and the function returns the collected samples as-is.
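So if the mic is running at 48kHz, Whisper receives 3× as many samples per second as it expects (about 2.76× at 44.1kHz). Interpreted as 16kHz, the speech is stretched to 3× its real duration and shifted down in pitch, which leads to truncated, incorrect, or completely garbled transcriptions.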
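To make the mismatch concrete, here is a small arithmetic sketch. The function names are made up for illustration and are not part of the codebase.

```rust
/// Duration in seconds that a consumer assumes for `num_samples`
/// when it believes the audio was sampled at `assumed_hz`.
fn apparent_duration_secs(num_samples: usize, assumed_hz: u32) -> f64 {
    num_samples as f64 / assumed_hz as f64
}

/// Factor by which audio is stretched in time when it was captured
/// at `device_hz` but is interpreted as `assumed_hz`.
fn stretch_factor(device_hz: u32, assumed_hz: u32) -> f64 {
    device_hz as f64 / assumed_hz as f64
}
```

For example, 10 seconds captured at 48kHz is 480,000 samples, which a 16kHz consumer treats as 30 seconds of audio.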
Proposed Fix
The resample_to_16khz() helper already exists in the same file.
Minimal fix:
- Add a sample_rate: u32 field to AudioRecorder
- Store config.sample_rate().0 in start_recording()
- Call resample_to_16khz(&samples, self.sample_rate) inside stop_recording() before returning
This change is fully self-contained and only touches this file.
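A rough sketch of how the three steps could fit together. This is not the actual file: the AudioRecorder here is a simplified stand-in without cpal (the real struct likely keeps its buffer behind shared state for the stream callback), push_samples() is a hypothetical helper standing in for that callback, and the resample_to_16khz() body is a placeholder using plain linear interpolation, since the existing helper's implementation and exact signature may differ.

```rust
const TARGET_RATE: u32 = 16_000;

/// Placeholder for the existing helper (assumed signature).
/// Plain linear interpolation, no low-pass filtering.
fn resample_to_16khz(samples: &[f32], source_rate: u32) -> Vec<f32> {
    if source_rate == TARGET_RATE || source_rate == 0 || samples.is_empty() {
        return samples.to_vec();
    }
    let ratio = source_rate as f64 / TARGET_RATE as f64;
    let out_len = (samples.len() as f64 / ratio).round() as usize;
    let last = samples.len() - 1;
    (0..out_len)
        .map(|i| {
            // Position of output sample i in the source buffer.
            let pos = i as f64 * ratio;
            let idx = (pos.floor() as usize).min(last);
            let frac = (pos - idx as f64) as f32;
            let a = samples[idx];
            let b = samples[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}

/// Simplified stand-in for the real recorder.
struct AudioRecorder {
    samples: Vec<f32>,
    sample_rate: u32, // step 1: new field holding the device rate
}

impl AudioRecorder {
    fn new() -> Self {
        Self { samples: Vec::new(), sample_rate: TARGET_RATE }
    }

    /// In the real code the rate would come from config.sample_rate().0;
    /// it is passed in here so the sketch runs without an audio device.
    fn start_recording(&mut self, device_sample_rate: u32) {
        self.sample_rate = device_sample_rate; // step 2: store the rate
        self.samples.clear();
    }

    /// Hypothetical helper standing in for the cpal input callback.
    fn push_samples(&mut self, chunk: &[f32]) {
        self.samples.extend_from_slice(chunk);
    }

    fn stop_recording(&mut self) -> Vec<f32> {
        let raw = std::mem::take(&mut self.samples);
        resample_to_16khz(&raw, self.sample_rate) // step 3: resample before returning
    }
}
```

With this shape, one second recorded at 48kHz comes back as 16,000 samples, and a device already running at 16kHz passes through unchanged. Note that linear interpolation without a low-pass filter can alias when downsampling, so the existing helper is presumably the better choice if it filters.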
Steps to Reproduce
- Record audio on a Mac (default mic = 44.1kHz / 48kHz)
- Run Whisper transcription
- Notice garbled or incorrect output