OpenWakeWord Trainer

Train custom wake word models for OpenWakeWord using synthetic voices from Kokoro TTS combined with your real voice recordings.

Why this exists: The official OpenWakeWord training process relies on Google Colab notebooks that frequently break. This repo provides a working local training pipeline that produces quality models.

What You Get

  • A trained .onnx wake word model (~400KB)
  • Works with OpenWakeWord, Home Assistant, or any system that supports ONNX models
  • Typical results: 70%+ accuracy, <2 false positives per hour

Requirements

  • NVIDIA GPU with CUDA (RTX 3060 12GB or better recommended)
  • Docker with NVIDIA Container Toolkit
  • ~20GB disk space for training data

Quick Start (Docker)

Docker is the recommended approach - it handles all the dependency hell for you.

1. Clone

git clone https://github.com/CoreWorxLab/openwakeword-training.git
cd openwakeword-training

2. Download Training Data (~17GB, one-time)

docker compose build trainer
docker compose run --rm trainer ./setup-data.sh

3. Record Your Voice (Optional but Recommended)

Recording 20-50 samples of your actual voice significantly improves detection. This runs on your host machine (needs microphone access):

pip install pyaudio numpy scipy
python record_samples.py --wake-word "hey cal"
  • Press ENTER to start each 2-second recording
  • Say your wake word naturally
  • Vary your tone, speed, and distance from the mic
  • Press 'q' to quit
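
For reference, the core of each recording pass looks roughly like the sketch below: capture 2 seconds of 16 kHz mono audio and save it as a WAV file. The chunk size, filename, and cleanup are illustrative; the actual record_samples.py may differ.

import numpy as np
import pyaudio
from scipy.io import wavfile

RATE, SECONDS, CHUNK = 16000, 2, 1024  # 16 kHz mono, 2-second clips

audio = pyaudio.PyAudio()
stream = audio.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                    input=True, frames_per_buffer=CHUNK)

input("Press ENTER to record...")
frames = [stream.read(CHUNK) for _ in range(int(RATE / CHUNK * SECONDS))]
samples = np.frombuffer(b"".join(frames), dtype=np.int16)
wavfile.write("sample_001.wav", RATE, samples)  # illustrative filename

stream.stop_stream()
stream.close()
audio.terminate()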

4. Train Your Model

docker compose run --rm trainer python train.py --wake-word "hey cal" --data-dir /app/data

Training takes 4-8 hours depending on GPU.

5. Test Your Model

Test on your host machine (needs microphone access):

pip install openwakeword pyaudio numpy
python test_model.py --model my_custom_model/hey_cal.onnx

Speak your wake word into the microphone and watch for detections.

Configuration

Parameter            Default                Description
--wake-word          "hey cal"              The wake word/phrase to detect
--samples-per-voice  200                    Samples generated per Kokoro voice
--training-steps     50000                  More steps improve quality but take longer
--layer-size         64                     Network size (32, 64, or 128)
--kokoro-url         http://localhost:8880  Kokoro TTS endpoint
--data-dir           .                      Training data directory (use /app/data with Docker)
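
For example, a longer run with a larger network (all flags are from the table above; the specific values here are just illustrative choices):

docker compose run --rm trainer python train.py --wake-word "hey cal" --training-steps 75000 --layer-size 128 --data-dir /app/data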

How It Works

  1. Sample Generation - Creates ~13K positive samples using 67 Kokoro voices with speed variation (0.7-1.3x), plus your real recordings (weighted 3x) - see the sketch after this list

  2. Negative Samples - Generates samples of clearly different phrases ("hello", "hey siri", "alexa") to teach the model what NOT to detect

  3. Augmentation - OpenWakeWord adds noise, reverb, and mixing to simulate real-world conditions

  4. Training - Neural network learns to distinguish your wake word from everything else
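
As a reference for step 1, each positive sample amounts to one TTS call. The sketch below assumes Kokoro-FastAPI's OpenAI-compatible /v1/audio/speech endpoint; the voice name, speed, and output path are illustrative, and train.py's actual implementation may differ.

import requests

# Generate one positive sample from a local Kokoro-FastAPI instance
resp = requests.post(
    "http://localhost:8880/v1/audio/speech",
    json={
        "model": "kokoro",
        "input": "hey cal",       # the wake word phrase
        "voice": "af_bella",      # one of the Kokoro voices (illustrative)
        "speed": 0.9,             # varied across 0.7-1.3x in the real pipeline
        "response_format": "wav",
    },
)
with open("positive_train/sample_0001.wav", "wb") as f:  # illustrative path
    f.write(resp.content)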

Key Insight

Don't use similar-sounding negatives. Training on phrases like "hey call" or "hey carl" actually hurts performance. Use only clearly different phrases like "hello", "hey siri", "alexa".

Output

my_custom_model/
├── hey_cal.onnx          # Your trained model - use this!
└── hey_cal/
    ├── positive_train/   # Generated training samples
    ├── positive_test/    # Test samples
    ├── negative_train/   # Negative training samples
    └── negative_test/    # Negative test samples

Using Your Model

from openwakeword.model import Model

model = Model(wakeword_models=["my_custom_model/hey_cal.onnx"])

# Process 16kHz mono audio frames
prediction = model.predict(audio_frame)
if prediction["hey_cal"] > 0.5:
    print("Wake word detected!")

Manual Setup (No Docker)

If you prefer not to use Docker, you can set up the environment directly:

./setup.sh
source venv/bin/activate

# Start Kokoro TTS separately
docker run -d --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:latest

python train.py --wake-word "hey cal"

Note: This requires Python 3.10+ and working CUDA. The pinned dependency versions in requirements.txt can conflict with other Python packages on your system, which is why Docker is recommended.

Troubleshooting

"Reached EOF prematurely" warnings

This is normal - Kokoro's WAV headers have a quirk, but the audio data is fine.

Low recall in training metrics

Training metrics use synthetic test samples. Real-world performance is usually better.

Model not detecting wake word

  • Ensure audio is 16kHz mono
  • Model needs ~2 seconds of audio buffer to warm up
  • Try lowering detection threshold (default 0.5)

TFLite conversion error at end

Ignore - the ONNX model is saved successfully before this error.

License

MIT
