bug: AudioRecorder.stop_recording() never resamples to 16kHz — causes garbage Whisper transcriptions #3

@Bhup-GitHUB

Description

Summary

AudioRecorder::stop_recording() claims to return 16kHz mono f32 samples, but it never actually resamples the audio. On macOS (especially Apple Silicon), the mic usually records at 44.1kHz or 48kHz. These raw samples are returned directly, while Whisper strictly expects 16kHz input.

Root Cause

In start_recording(), the device sample rate is read from cpal but never stored in the struct. In stop_recording(), there’s only a TODO comment about resampling — the function just returns the collected samples as-is.

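So if the mic is running at 48kHz, Whisper receives 3× as many samples per second of speech as it expects (about 2.76× at 44.1kHz). Interpreted at 16kHz, the audio is effectively slowed down and pitch-shifted, which leads to truncated, incorrect, or completely garbled transcriptions.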
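To make the mismatch concrete, here is a small sketch of the arithmetic. The function and constant names are illustrative only and are not taken from the project.

```rust
/// Sample rate Whisper expects, as stated in this issue.
const WHISPER_SAMPLE_RATE: u32 = 16_000;

/// Duration (in seconds) that a buffer appears to have when its
/// samples are interpreted at `rate` Hz.
fn duration_secs(num_samples: usize, rate: u32) -> f64 {
    num_samples as f64 / rate as f64
}

/// Factor by which unresampled audio is stretched in time when it was
/// captured at `device_rate` but is interpreted at 16 kHz.
fn stretch_factor(device_rate: u32) -> f64 {
    device_rate as f64 / WHISPER_SAMPLE_RATE as f64
}
```

For example, 10 seconds captured at 48kHz is 480,000 samples. `duration_secs(480_000, 48_000)` gives 10 seconds, but `duration_secs(480_000, WHISPER_SAMPLE_RATE)` gives 30 seconds, so Whisper is effectively handed speech played back at one third speed.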

Proposed Fix

The resample_to_16khz() helper already exists in the same file.

Minimal fix:

  • Add a sample_rate: u32 field to AudioRecorder
  • Store config.sample_rate().0 in start_recording()
  • Call resample_to_16khz(&samples, self.sample_rate) inside stop_recording() before returning

This change is fully self-contained and only touches this file.
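A rough sketch of what the three steps could look like, with several assumptions: the struct below is a simplified stand-in for the real `AudioRecorder` (no cpal stream, no shared buffer), `push_samples()` is a hypothetical substitute for the input stream callback, and the body of `resample_to_16khz()` is a plain linear-interpolation placeholder, since the existing helper's implementation is not shown in this issue.

```rust
const TARGET_RATE: u32 = 16_000;

/// Placeholder for the existing helper (assumed signature).
/// Uses linear interpolation with no low-pass filtering.
fn resample_to_16khz(samples: &[f32], source_rate: u32) -> Vec<f32> {
    if source_rate == TARGET_RATE || source_rate == 0 || samples.is_empty() {
        return samples.to_vec();
    }
    let ratio = source_rate as f64 / TARGET_RATE as f64;
    // Integer arithmetic avoids off-by-one lengths from float rounding.
    let out_len =
        (samples.len() as u64 * TARGET_RATE as u64 / source_rate as u64) as usize;
    let last = samples.len() - 1;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * ratio;
            let idx = (pos as usize).min(last);
            let frac = (pos - idx as f64) as f32;
            let a = samples[idx];
            let b = samples[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}

/// Simplified stand-in for the real recorder.
struct AudioRecorder {
    samples: Vec<f32>,
    /// New field: the device rate captured when recording starts.
    sample_rate: u32,
}

impl AudioRecorder {
    fn new() -> Self {
        Self { samples: Vec::new(), sample_rate: TARGET_RATE }
    }

    /// In the real code the rate would come from `config.sample_rate().0`;
    /// here it is passed in so the sketch runs without an audio device.
    fn start_recording(&mut self, device_sample_rate: u32) {
        self.sample_rate = device_sample_rate;
        self.samples.clear();
    }

    /// Hypothetical stand-in for the input stream callback.
    fn push_samples(&mut self, chunk: &[f32]) {
        self.samples.extend_from_slice(chunk);
    }

    /// Resamples with the stored device rate before returning.
    fn stop_recording(&mut self) -> Vec<f32> {
        let samples = std::mem::take(&mut self.samples);
        resample_to_16khz(&samples, self.sample_rate)
    }
}
```

With this shape, one second of audio pushed at 48kHz (48,000 samples) comes back from `stop_recording()` as 16,000 samples, and audio already at 16kHz passes through unchanged.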

Steps to Reproduce

  1. Record audio on a Mac (default mic = 44.1kHz / 48kHz)
  2. Run Whisper transcription
  3. Notice garbled or incorrect output
