Speaker Diarization is a system that identifies and separates the speakers in an audio file, labeling each speaker and their timestamps. The model is integrated with an Automatic Speech Recognition (ASR) model to transcribe each speaker's audio. It takes a folder of audio files as input and produces a CSV file containing the speaker separation with the corresponding timestamps and transcriptions.

This work aids child rescue efforts by distinguishing victim and abuser voices, providing crucial evidence for court proceedings, and by distinguishing speakers from background noise during criminal investigations.
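As a sketch of the kind of CSV this pipeline produces (the column names and segment values below are illustrative assumptions, not the repository's exact schema), each row pairs a speaker label with a time span and the transcribed text:

```python
import csv
import io

# Hypothetical diarization + transcription results: (speaker, start_s, end_s, text).
# The tuple layout and column names are assumptions for illustration.
segments = [
    ("SPEAKER_00", 0.0, 4.2, "Hello, can you hear me?"),
    ("SPEAKER_01", 4.5, 7.9, "Yes, loud and clear."),
    ("SPEAKER_00", 8.1, 12.0, "Great, let's get started."),
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["speaker", "start", "end", "transcription"])
writer.writerows(segments)

csv_text = buf.getvalue()
print(csv_text)
```

In the real application the CSV is written to the output directory you select in RescueBox.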
**Clone the Repository**

```shell
git clone https://github.com/UMass-Rescue/Audio-Diarization.git
cd Audio-Diarization
```
**Install Dependencies**

For the best results, create a virtual environment. You can use any method to create one; one way is:

```shell
python -m venv <virtual_env_name>
```

Activate the virtual environment.

On macOS/Linux:

```shell
source <virtual_env_name>/bin/activate
```

On Windows:

```shell
<virtual_env_name>\Scripts\activate
```

Install the required Python packages using the following command:

```shell
pip install -r requirements.txt
```
Make sure ffmpeg is installed on your system if you don't already have it.

On macOS, if you already have Homebrew, you can use the command below to install ffmpeg directly. If not, follow the Homebrew documentation to install Homebrew first, then run:

```shell
brew install ffmpeg
```

On Windows, download the ffmpeg executable (use the Windows build from gyan.dev), follow the instructions in the installer, and add ffmpeg to your environment variables to make it accessible globally.
**Access the model**

This step is no longer needed unless you want to run the model API directly. By default, this application runs the local model pipeline for pyannote/speaker-diarization-3.0.

```shell
huggingface-cli login
```

You will be prompted to enter an access token, which you can find at https://huggingface.co/settings/tokens.
**Running the Flask-ML Server**

Start the Flask-ML server to work with RescueBox for Audio Diarization:

```shell
python model_3endpoints.py
```

The server will start running on 127.0.0.1:5000.
Download and run RescueBox Desktop from the following link: Rescue Box Desktop

Open the RescueBox Desktop application and register the model.
On the left-hand side you can see three options: Speaker Diarization, Audio Transcription, and Speaker Diarization + Audio Transcription. Select an endpoint based on your needs: Speaker Diarization runs only the speaker separation with timestamps; Audio Transcription runs only the audio transcription; and Speaker Diarization + Audio Transcription performs the primary task of separating the speakers in an audio file with their timestamps and transcribed audio.

Set the input and output directories. The input directory should contain the audio file(s), and the output directory is where the CSV file with the speaker separation, timestamps, and transcriptions will be written. Once this is done, click 'Run Model' at the bottom.

Click the 'View' button to see the results. They will be displayed in the results section.
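The core step of the combined endpoint — attaching transcribed words to diarized speaker turns — can be sketched as a timestamp-overlap match. The turn and word structures below are illustrative assumptions, not the repository's actual data model:

```python
# Hypothetical diarization turns: (speaker, start_s, end_s).
turns = [("SPEAKER_00", 0.0, 5.0), ("SPEAKER_01", 5.0, 10.0)]

# Hypothetical ASR output: (word, start_s, end_s).
words = [("hello", 0.5, 1.0), ("there", 1.1, 1.5),
         ("hi", 5.2, 5.6), ("back", 5.7, 6.1)]

def assign_words(turns, words):
    """Attach each word to the speaker turn it overlaps the most."""
    result = {speaker: [] for speaker, _, _ in turns}
    for text, w_start, w_end in words:
        best, best_overlap = None, 0.0
        for speaker, t_start, t_end in turns:
            overlap = min(w_end, t_end) - max(w_start, t_start)
            if overlap > best_overlap:
                best, best_overlap = speaker, overlap
        if best is not None:
            result[best].append(text)
    return {spk: " ".join(ws) for spk, ws in result.items()}

transcript = assign_words(turns, words)
print(transcript)  # {'SPEAKER_00': 'hello there', 'SPEAKER_01': 'hi back'}
```

Words whose timestamps fall outside every turn are dropped here; a production pipeline would need a policy for those (e.g. an "UNKNOWN" speaker bucket).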