STT+SER+TTS Backend

STT = Speech-to-Text, SER = Speech Emotion Recognition, TTS = Text-to-Speech

Setup

Instructions are for Ubuntu-based systems.

In the steps below, $PROJECT_ROOT is the path to this project's root directory.

Follow the instructions in the Requirements section of this page: https://github.com/Uberi/speech_recognition?tab=readme-ov-file#requirements
Follow the instructions here to download the SER (Speech Emotion Recognition) model weights into the directory $PROJECT_ROOT/model_weight.
In the project root directory, copy the contents of .env.example into a new file named .env, then set the environment variables appropriately.
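
The file-related parts of the steps above can be sketched as the following shell commands. This is a minimal sketch, not the project's official setup script: the fallback of PROJECT_ROOT to the current directory and the printed messages are assumptions added for illustration. Downloading the weights and filling in the variable values still have to be done by hand.

```shell
# Sketch only: PROJECT_ROOT falls back to the current directory if unset
# (this fallback is an assumption, not part of the project).
PROJECT_ROOT="${PROJECT_ROOT:-$PWD}"
cd "$PROJECT_ROOT"

# Directory that will hold the downloaded SER model weights.
mkdir -p model_weight

# Create .env from the template without overwriting an existing .env.
if [ -f .env ]; then
  echo ".env already exists, leaving it untouched"
elif [ -f .env.example ]; then
  cp .env.example .env
  echo "created .env; edit it to set the environment variables"
else
  echo ".env.example not found in $PROJECT_ROOT" >&2
fi
```

The existence check keeps a second run from overwriting values already entered in .env.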

To set up the Google authentication environment, also see the Google Speech-to-Text API guide.

Run

# First time only: create a virtual environment and install the dependencies.
cd "$PROJECT_ROOT"
virtualenv venv
source ./venv/bin/activate
pip install -r requirements.txt

# Every time: activate the virtual environment and start the server on port 7123.
cd "$PROJECT_ROOT"
source ./venv/bin/activate
fastapi run main.py --host 0.0.0.0 --port 7123
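
Once the server is running, a quick reachability check from a second terminal can confirm that it is listening on port 7123. The helper below is a hypothetical sketch: wait_for_server is not part of this project, it requires curl, and the /docs URL in the usage line assumes that FastAPI's default interactive docs route has not been disabled in main.py.

```shell
# Hypothetical helper (not part of this project): poll a URL until it
# answers successfully, or give up after a number of attempts.
wait_for_server() {
  url="$1"
  tries="${2:-10}"   # number of attempts, one second apart
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -f: treat HTTP errors as failures, -sS: quiet but still show errors
    if curl -fsS -o /dev/null "$url"; then
      echo "server is up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "server did not respond: $url" >&2
  return 1
}

# Usage (assumes the default /docs route is enabled):
# wait_for_server http://localhost:7123/docs
```

The function returns a non-zero status when the server never answers, so it can also gate later steps in a script.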
