VAD on singing voice? #13

@shoegazerstella

I am trying to adapt this script to detect voice and silence segments in an audio file that contains a source-separated singing-voice signal obtained from http://github.com/sigsep/open-unmix-pytorch

I have some questions:

  • Does it make sense to compute the threshold for each data_window independently instead of using a fixed speech_energy_threshold? I would do that by computing the energy of the data_window signal, normalizing it, and taking its mean value. If this value equals 0.0, I would label that segment as silence.

  • Is there a clever way to choose parameters such as sample_window, sample_overlap, and speech_window so that they are more appropriate for singing-voice signals?
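To make the first question concrete, here is a minimal sketch of the per-window check I have in mind. It is only an illustration under my own assumptions: data_window is taken to be a plain sequence of float samples, "normalizing" is read as dividing the squared samples by their peak within the window, and the helper names window_mean_energy and is_silence as well as the tol argument are hypothetical and not part of the script.

```python
import math


def window_mean_energy(data_window):
    """Mean of the peak-normalized energy of one window (hypothetical helper)."""
    # Per-sample energy of the window.
    energy = [float(sample) ** 2 for sample in data_window]
    # Peak energy inside this window; 0.0 for an empty or all-zero window.
    peak = max(energy, default=0.0)
    if peak == 0.0:
        return 0.0
    # Normalize by the window's own peak, then average.
    return sum(e / peak for e in energy) / len(energy)


def is_silence(data_window, tol=0.0):
    """Label a window as silence when its mean normalized energy is <= tol.

    tol is an assumed tolerance; tol=0.0 reproduces the strict "== 0.0" rule.
    """
    return window_mean_energy(data_window) <= tol


# Two synthetic windows: digital silence and a pure tone (8 full cycles).
silent = [0.0] * 1024
tone = [math.sin(2 * math.pi * 8 * n / 1024) for n in range(1024)]

print(is_silence(silent))  # all-zero window
print(is_silence(tone))    # window with signal
```

With this kind of per-window normalization, only an all-zero window gives exactly 0.0, so I suspect a small non-zero tol would be needed in practice. That is part of why I am asking whether the approach makes sense.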

Thanks a lot!
