- 8 Languages: American English, British English, Spanish, French, Hindi, Italian, Portuguese, Japanese, Chinese
- 50+ Voices: Male and female voices with unique personalities
- Lightning Fast: GPU-accelerated inference with streaming audio
- macOS Hotkey: Double-tap the Option key (⌥) for instant TTS anywhere
- High Quality: High-fidelity neural audio synthesis
- Easy Setup: Installation through the UV package manager
- System-Wide: Works with any macOS application
Voices/Languages Available

American English
Female Voices: af_alloy, af_aoede, af_bella, af_heart, af_jessica, af_kore, af_nicole, af_nova, af_river, af_sarah, af_sky
Male Voices: am_adam, am_echo, am_eric, am_fenrir, am_liam, am_michael, am_onyx, am_puck, am_santa

British English
Female Voices: bf_alice, bf_emma, bf_isabella, bf_lily
Male Voices: bm_daniel, bm_fable, bm_george, bm_lewis

Spanish
Female Voices: ef_dora
Male Voices: em_alex, em_santa

French
Female Voices: ff_siwis

Hindi
Female Voices: hf_alpha, hf_beta
Male Voices: hm_omega, hm_psi

Italian
Female Voices: if_sara
Male Voices: im_nicola

Japanese
Female Voices: jf_alpha, jf_gongitsune, jf_nezumi, jf_tebukuro
Male Voices: jm_kumo

Portuguese
Female Voices: pf_dora
Male Voices: pm_alex, pm_santa

Chinese
Female Voices: zf_xiaobei, zf_xiaoni, zf_xiaoxiao, zf_xiaoyi
Male Voices: zm_yunjian, zm_yunxi, zm_yunxia, zm_yunyang
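Once the voice packs are downloaded (see the setup steps below), the voices actually installed can be listed straight from the voices/ directory. A minimal sketch, assuming each pack is a .pt file named after its voice; the helper name is illustrative:

```python
from pathlib import Path

def available_voices(voices_dir="voices"):
    """List installed voice names by stripping the .pt suffix
    from each voice pack file in the voices/ directory."""
    return sorted(p.stem for p in Path(voices_dir).glob("*.pt"))
```

This only reflects what is on disk, so it is also a quick way to check that the Hugging Face download step completed.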
We recommend UV for this project because it's:
- 10-100x faster than pip
- More secure, with built-in dependency resolution
- Zero configuration: works out of the box
- A drop-in replacement for pip/pipenv/poetry
- Widely adopted by major Python projects
- Install UV (if you don't have it; this assumes Python is already installed on your system):

pip install uv
- Clone and set up the project:

# Clone repo
cd Orator
# Create virtual environment and install dependencies
uv venv --python=3.11
source .venv/bin/activate  # On macOS/Linux
uv pip install -r requirements.txt
- Install espeak-ng (required for phonemization):

# macOS
brew install espeak-ng
# Verify installation
espeak-ng --version
# eSpeak NG text-to-speech: 1.51  Data at: /opt/homebrew/Cellar/espeak-ng/1.51/share/espeak-ng-data
- Download model and voices (if not included):

uv pip install -U "huggingface_hub[cli]"
# Download model
huggingface-cli download hexgrad/Kokoro-82M --include "kokoro-v1_0.pth" --local-dir .
# Download voices
huggingface-cli download hexgrad/Kokoro-82M --include "voices/*" --local-dir .
Language Packs
- By default, "en_core_web_sm" is installed through requirements.txt for English; install the small language packs for other languages from spaCy as needed.
Grant Accessibility Permissions First:
- Open System Preferences → Security & Privacy → Privacy
- Select "Accessibility" from the left panel
- Click the lock icon and enter your password
- Add your terminal application (Terminal.app, iTerm2, etc.)
- Ensure it's checked/enabled
Run the hotkey application:

# Make sure you are inside the virtual environment
python3 macos_tts_hotkey.py

How to use:
- Select any text in any macOS application
- Double-tap the Option key (⌥) quickly to start TTS
- Press the Escape key to stop TTS playback at any time
- Listen to the text being read aloud!
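The double-tap trigger above boils down to a timing check: a tap counts as a double-tap only if it lands within a short window of the previous one. This is an illustrative sketch, not the actual logic in macos_tts_hotkey.py; the class name and the 0.4 s window are assumptions:

```python
class DoubleTapDetector:
    """Fire when two Option-key taps arrive within `window` seconds."""

    def __init__(self, window=0.4):  # 0.4 s is an illustrative threshold
        self.window = window
        self.last_tap = None

    def tap(self, now):
        """Record a tap at time `now` (seconds); return True on a double-tap."""
        is_double = self.last_tap is not None and now - self.last_tap <= self.window
        # Reset after a double-tap so a third tap starts a fresh sequence
        self.last_tap = None if is_double else now
        return is_double
```

In a real key-event callback, `now` would come from time.monotonic().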
Edit config_hotkey.json:
{
"model_path": "kokoro-v1_0.pth",
"voices_dir": "voices",
"voice": "af_bella",
"speed": 1.0,
"device": "auto"
}

Choose voices by language prefix:
- af_*/am_* - American English
- bf_*/bm_* - British English
- ef_*/em_* - Spanish
- ff_* - French
- hf_*/hm_* - Hindi
- if_*/im_* - Italian
- jf_*/jm_* - Japanese
- pf_*/pm_* - Portuguese
- zf_*/zm_* - Chinese
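Because each voice name encodes its language in the first letter, the matching KPipeline lang_code can be derived mechanically. A small sketch of that convention (the helper is ours, not part of the library):

```python
# First letter of a voice name -> KPipeline lang_code
# ('a' = American English, 'b' = British English, ..., 'z' = Chinese)
LANG_PREFIXES = set("abefhijpz")

def lang_code_for_voice(voice):
    """Return the lang_code for a voice name, e.g. 'af_bella' -> 'a'."""
    code = voice[0]
    if code not in LANG_PREFIXES:
        raise ValueError(f"Unknown language prefix in voice name: {voice!r}")
    return code
```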
# Create pipelines for different languages
# (assumes `model` is an already-loaded KModel)
en_pipeline = KPipeline(lang_code='a', model=model)  # American English
es_pipeline = KPipeline(lang_code='e', model=model)  # Spanish
ja_pipeline = KPipeline(lang_code='j', model=model)  # Japanese

# Use the appropriate pipeline for each language
english_audio = list(en_pipeline("Hello world!", voice="af_bella"))[0].audio
spanish_audio = list(es_pipeline("¡Hola mundo!", voice="ef_dora"))[0].audio
japanese_audio = list(ja_pipeline("こんにちは世界！", voice="jf_alpha"))[0].audio

"Failed to start keyboard monitoring"
- Grant Accessibility permissions in System Preferences
- Restart the application after granting permissions
"espeak-ng not found"
# Install espeak-ng
brew install espeak-ng
# Verify installation
which espeak-ng

"Model file not found"
- Ensure kokoro-v1_0.pth is in the project root
- Check file permissions and path
"CUDA out of memory"
# Use CPU instead
config.device = "cpu"
# Or reduce batch size for long texts

"Voice file not found"
- Ensure voice files are in the voices/ directory
- Check that the voice name matches exactly (case-sensitive)
"Stop functionality not working"
- Ensure the application has focus or accessibility permissions
- Try pressing the Escape key while TTS is actively playing
- Check terminal logs for any error messages
- GPU Usage: Automatic CUDA detection, falls back to CPU
- Memory Management: Automatic cleanup after generation
- Streaming: Use generate_audio_stream() for long texts
- Caching: Voice packs are cached after first load
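The caching note above can be illustrated with functools.lru_cache: the first request for a voice pack hits disk, later requests return the cached object. A sketch with a stand-in loader (the real code would torch.load the .pt file):

```python
from functools import lru_cache

disk_reads = []  # tracks how often the "disk" is actually touched

@lru_cache(maxsize=None)
def load_voice_pack(name):
    """Load a voice pack once; subsequent calls are served from cache.
    Stand-in body: a real loader would torch.load(f"voices/{name}.pt")."""
    disk_reads.append(name)
    return {"voice": name}

load_voice_pack("af_bella")
load_voice_pack("af_bella")  # cached: no second disk read
```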
Orator/
├── kokoro/                 # Core TTS library
│   ├── __init__.py
│   ├── model.py            # KModel implementation
│   ├── pipeline.py         # KPipeline
│   └── ...
├── voices/                 # Voice pack files (.pt)
│   ├── af_bella.pt
│   ├── am_adam.pt
│   └── ...
├── macos_tts_hotkey.py     # macOS hotkey application
├── kokoro-v1_0.pth         # Main TTS model
├── requirements.txt        # Python dependencies
└── README.md               # This file
We welcome contributions! Please feel free to:
- Report bugs and issues
- Suggest new features
- Submit pull requests
- Add new voice packs
- Improve documentation
- Streaming audio chunks for long-form text (controlled low latency)
- Speed controls for the audio stream
- LLM-driven agentic AI capabilities
- Native macOS application/interface for UI-driven audio controls
- UI voice-swap controls
Want to help shape the future of Kokoro TTS? Here's how:
- Report Issues: Help us identify bugs and improvements
- Suggest Features: Share your ideas for new functionality
- Contribute Code: Submit PRs for features or fixes
- Design UI/UX: Help design the native app interface
- Write Documentation: Improve guides and tutorials
- Add Voices: Contribute new voice packs and languages
- Built on the amazing Kokoro TTS model
- Powered by PyTorch and modern neural architectures
- Inspired by the need for accessible, high-quality TTS

