niranjanakella/Orator


Orator TTS Engine

Kokoro TTS | Python | PyTorch | macOS

High-quality neural text-to-speech with 50+ voices across 9 languages


✨ Features

  • 🌍 9 Languages: American English, British English, Spanish, French, Hindi, Italian, Portuguese, Japanese, Chinese
  • 🎭 50+ Voices: Male and female voices with unique personalities
  • ⚡ Lightning Fast: GPU-accelerated inference with streaming audio
  • 🎯 macOS Hotkey: Double-tap the Option key (⌥) for instant TTS anywhere
  • 🔊 High Quality: High-fidelity neural audio synthesis
  • 🚀 Easy Setup: Installation through the UV package manager
  • 📱 System-Wide: Works with any macOS application

🗣️ Available Voices & Languages

Voices/Languages Available

🇺🇸 American English (a)

Female Voices:

  • af_alloy
  • af_aoede
  • af_bella
  • af_heart
  • af_jessica
  • af_kore
  • af_nicole
  • af_nova
  • af_river
  • af_sarah
  • af_sky

Male Voices:

  • am_adam
  • am_echo
  • am_eric
  • am_fenrir
  • am_liam
  • am_michael
  • am_onyx
  • am_puck
  • am_santa

🇬🇧 British English (b)

Female Voices:

  • bf_alice
  • bf_emma
  • bf_isabella
  • bf_lily

Male Voices:

  • bm_daniel
  • bm_fable
  • bm_george
  • bm_lewis

🇪🇸 Spanish (e)

Female Voices:

  • ef_dora

Male Voices:

  • em_alex
  • em_santa

🇫🇷 French (f)

Female Voices:

  • ff_siwis

🇮🇳 Hindi (h)

Female Voices:

  • hf_alpha
  • hf_beta

Male Voices:

  • hm_omega
  • hm_psi

🇮🇹 Italian (i)

Female Voices:

  • if_sara

Male Voices:

  • im_nicola

🇯🇵 Japanese (j)

Female Voices:

  • jf_alpha
  • jf_gongitsune
  • jf_nezumi
  • jf_tebukuro

Male Voices:

  • jm_kumo

🇧🇷 Portuguese (p)

Female Voices:

  • pf_dora

Male Voices:

  • pm_alex
  • pm_santa

🇨🇳 Chinese (z)

Female Voices:

  • zf_xiaobei
  • zf_xiaoni
  • zf_xiaoxiao
  • zf_xiaoyi

Male Voices:

  • zm_yunjian
  • zm_yunxi
  • zm_yunxia
  • zm_yunyang

🚀 Quick Start

Why UV? The Future of Python Package Management

We recommend UV for this project because it's:

  • ⚡ 10-100x faster than pip
  • 🔒 More secure with built-in dependency resolution
  • 🎯 Zero configuration - works out of the box
  • 🔄 Drop-in replacement for pip/pipenv/poetry
  • 🌟 Industry standard - used by major Python projects

Installation

  1. Install UV (if you don't have it):
  • This assumes Python is already installed on your system.
    pip install uv
  2. Clone and set up the project:

    # Clone the repo
    git clone https://github.com/niranjanakella/Orator.git
    cd Orator
    
    # Create a virtual environment and install dependencies
    uv venv --python=3.11
    source .venv/bin/activate  # On macOS/Linux
    uv pip install -r requirements.txt
  3. Install espeak-ng (required for phonemization):

    # macOS
    brew install espeak-ng
    
    # Verify installation
    espeak-ng --version
    
    # Example output (version and path may differ):
    # eSpeak NG text-to-speech: 1.51  Data at: /opt/homebrew/Cellar/espeak-ng/1.51/share/espeak-ng-data
  4. Download model and voices (if not included):

    uv pip install -U "huggingface_hub[cli]"
    
    # Download model
    huggingface-cli download hexgrad/Kokoro-82M --include "kokoro-v1_0.pth" --local-dir .
    
    # Download voices
    huggingface-cli download hexgrad/Kokoro-82M --include "voices/*" --local-dir .
  5. Language Pack

  • By default, the English spaCy model "en_core_web_sm" is installed through requirements.txt. For other languages, install the corresponding small language model from spaCy.
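Before launching the app, the installation steps above can be checked with a short script. This is a minimal sketch, not part of Orator: the function name `check_setup` is hypothetical, and it assumes the default file names used in this README (`kokoro-v1_0.pth` and a `voices/` directory in the project root).

```python
import shutil
from pathlib import Path


def check_setup(root: Path = Path(".")) -> list:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # espeak-ng must be reachable on PATH for phonemization.
    if shutil.which("espeak-ng") is None:
        problems.append("espeak-ng is not on PATH (try: brew install espeak-ng)")
    # The model weights are expected in the project root (assumed default name).
    if not (root / "kokoro-v1_0.pth").is_file():
        problems.append("model file kokoro-v1_0.pth is missing")
    # At least one voice pack is expected under voices/.
    voices = root / "voices"
    if not voices.is_dir() or not any(voices.glob("*.pt")):
        problems.append("no voice packs (*.pt) found in voices/")
    return problems


if __name__ == "__main__":
    issues = check_setup()
    for issue in issues:
        print("MISSING:", issue)
    if not issues:
        print("Setup looks complete.")
```

Run it from the project root with the virtual environment active; each reported line maps back to one of the installation steps.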

🎯 Usage

1. macOS Hotkey Application

Grant Accessibility Permissions First:

  1. Open System Preferences → Security & Privacy → Privacy (on macOS 13 and later: System Settings → Privacy & Security)
  2. Select "Accessibility" from the left panel
  3. Click the lock icon and enter your password
  4. Add your terminal application (Terminal.app, iTerm2, etc.)
  5. Ensure it's checked/enabled

Run the hotkey application:

# Make sure you are inside the virtual environment
python3 macos_tts_hotkey.py

How to use:

  • Select any text in any macOS application
  • Double-tap the Option key (⌥) quickly to start TTS
  • Press Escape key to stop TTS playback at any time
  • Listen to the text being read aloud!
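The double-tap trigger can be pictured as a small timing check: two taps of the same key count as a double tap only if they arrive close together. The sketch below illustrates that idea only. The class name `DoubleTapDetector` and the 0.4 second threshold are assumptions, not values taken from `macos_tts_hotkey.py`.

```python
import time
from typing import Optional


class DoubleTapDetector:
    """Report a double tap when two taps arrive within `max_interval` seconds."""

    def __init__(self, max_interval: float = 0.4):
        self.max_interval = max_interval  # assumed threshold, not taken from Orator
        self._last_tap: Optional[float] = None

    def tap(self, now: Optional[float] = None) -> bool:
        """Register a key tap; return True if it completes a double tap."""
        now = time.monotonic() if now is None else now
        is_double = (
            self._last_tap is not None
            and now - self._last_tap <= self.max_interval
        )
        # Reset after a double tap so a third quick tap does not trigger again.
        self._last_tap = None if is_double else now
        return is_double
```

A keyboard listener would call `tap()` each time the Option key is released and start playback when it returns `True`.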

⚙️ Configuration

Hotkey Application Config

Edit config_hotkey.json:

{
    "model_path": "kokoro-v1_0.pth",
    "voices_dir": "voices",
    "voice": "af_bella",
    "speed": 1.0,
    "device": "auto"
}
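A config file like this is typically read by merging it over a set of defaults. The following is a minimal sketch under that assumption. `load_config`, `DEFAULTS`, and the positive-speed check are illustrative and may not match how the hotkey application actually parses the file.

```python
import json
from pathlib import Path

# Defaults mirror the example config above.
DEFAULTS = {
    "model_path": "kokoro-v1_0.pth",
    "voices_dir": "voices",
    "voice": "af_bella",
    "speed": 1.0,
    "device": "auto",
}


def load_config(path: str = "config_hotkey.json") -> dict:
    """Merge the JSON file over the defaults and run a basic sanity check."""
    config = dict(DEFAULTS)
    file = Path(path)
    if file.is_file():
        config.update(json.loads(file.read_text(encoding="utf-8")))
    # Assumed rule: speed is a positive multiplier (1.0 = normal rate).
    if not isinstance(config["speed"], (int, float)) or config["speed"] <= 0:
        raise ValueError("speed must be a positive number")
    return config


if __name__ == "__main__":
    print(load_config())
```

With this approach a config file only needs to list the keys that differ from the defaults, for example just `"voice"` and `"speed"`.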

Voice Selection

Choose voices by language prefix:

  • af_* / am_* - American English
  • bf_* / bm_* - British English
  • ef_* / em_* - Spanish
  • ff_* - French
  • hf_* / hm_* - Hindi
  • if_* / im_* - Italian
  • jf_* / jm_* - Japanese
  • pf_* / pm_* - Portuguese
  • zf_* / zm_* - Chinese
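The prefix convention above can be decoded mechanically: the first letter is the language code and the second letter is the voice gender. A minimal sketch follows. The helper `describe_voice` and its lookup tables are illustrative and not part of Orator.

```python
# Hypothetical helper: decode a voice name such as "af_bella" using the prefix table above.
LANGUAGES = {
    "a": "American English",
    "b": "British English",
    "e": "Spanish",
    "f": "French",
    "h": "Hindi",
    "i": "Italian",
    "j": "Japanese",
    "p": "Portuguese",
    "z": "Chinese",
}
GENDERS = {"f": "female", "m": "male"}


def describe_voice(voice: str) -> dict:
    """Split a voice name into language code, language, gender, and speaker name."""
    prefix, sep, name = voice.partition("_")
    if not sep or len(prefix) != 2 or not name:
        raise ValueError(f"expected '<lang><gender>_<name>', got {voice!r}")
    lang_code, gender_code = prefix[0], prefix[1]
    if lang_code not in LANGUAGES or gender_code not in GENDERS:
        raise ValueError(f"unknown voice prefix {prefix!r}")
    return {
        "lang_code": lang_code,
        "language": LANGUAGES[lang_code],
        "gender": GENDERS[gender_code],
        "name": name,
    }


if __name__ == "__main__":
    print(describe_voice("af_bella"))
```

The `lang_code` value is the same single letter shown in parentheses next to each language in the voice list.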

🔧 Advanced Usage

Multi-language Support

# Module path as listed under Project Structure
from kokoro.pipeline import KPipeline

# `model` is assumed to be an already loaded KModel instance (see kokoro/model.py)

# Create pipelines for different languages
en_pipeline = KPipeline(lang_code='a', model=model)  # American English
es_pipeline = KPipeline(lang_code='e', model=model)  # Spanish
ja_pipeline = KPipeline(lang_code='j', model=model)  # Japanese

# Use the appropriate pipeline for each language.
# [0] keeps only the first generated chunk's audio.
english_audio = list(en_pipeline("Hello world!", voice="af_bella"))[0].audio
spanish_audio = list(es_pipeline("¡Hola mundo!", voice="ef_dora"))[0].audio
japanese_audio = list(ja_pipeline("こんにちは世界！", voice="jf_alpha"))[0].audio
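When voices from several languages are used in one session, it can help to create each pipeline once and reuse it. The sketch below shows one way to do that, keyed by the first letter of the voice name. `PipelineCache` is a hypothetical helper, and the pipeline constructor is passed in as a factory so the example stays self-contained.

```python
from typing import Any, Callable, Dict


class PipelineCache:
    """Create one pipeline per language code on first use and reuse it afterwards."""

    def __init__(self, factory: Callable[[str], Any]):
        # `factory` builds a pipeline for a language code, for example:
        #   lambda code: KPipeline(lang_code=code, model=model)
        self._factory = factory
        self._pipelines: Dict[str, Any] = {}

    def for_voice(self, voice: str) -> Any:
        """Return the cached pipeline matching the voice's language prefix."""
        lang_code = voice[0]  # "af_bella" -> "a", "jf_alpha" -> "j"
        if lang_code not in self._pipelines:
            self._pipelines[lang_code] = self._factory(lang_code)
        return self._pipelines[lang_code]
```

With a real factory, `cache.for_voice("ef_dora")("¡Hola mundo!", voice="ef_dora")` would reuse the Spanish pipeline on every later call.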

🛠️ Troubleshooting

Common Issues

"Failed to start keyboard monitoring"

  • Grant Accessibility permissions in System Preferences
  • Restart the application after granting permissions

"espeak-ng not found"

# Install espeak-ng
brew install espeak-ng

# Verify installation
which espeak-ng

"Model file not found"

  • Ensure kokoro-v1_0.pth is in the project root
  • Check file permissions and path

"CUDA out of memory"

# Use the CPU instead: set "device" in config_hotkey.json
"device": "cpu"

# Or reduce batch size for long texts

"Voice file not found"

  • Ensure voice files are in the voices/ directory
  • Check that the voice name matches exactly (case-sensitive)
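A misspelled voice name is easier to spot when the error lists close matches. The following sketch assumes voice packs are stored as `<voice>.pt` inside the voices directory, as shown under Project Structure. `find_voice` is a hypothetical helper, not an Orator function.

```python
import difflib
from pathlib import Path


def find_voice(voice: str, voices_dir: str = "voices") -> Path:
    """Return the path to `<voice>.pt`, or raise with close-match suggestions."""
    directory = Path(voices_dir)
    # Compare against the directory listing so the match stays case-sensitive
    # even on case-insensitive file systems (the macOS default).
    available = sorted(p.stem for p in directory.glob("*.pt"))
    if voice in available:
        return directory / f"{voice}.pt"
    hints = difflib.get_close_matches(voice, available, n=3)
    message = f"voice {voice!r} not found in {directory}/"
    if hints:
        message += f"; did you mean {', '.join(hints)}?"
    raise FileNotFoundError(message)
```

For example, asking for `af_bela` when `af_bella.pt` exists raises an error that suggests `af_bella`.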

"Stop functionality not working"

  • Ensure the application has focus or accessibility permissions
  • Try pressing Escape key while TTS is actively playing
  • Check terminal logs for any error messages

Performance Optimization

  • GPU Usage: Automatic CUDA detection, falls back to CPU
  • Memory Management: Automatic cleanup after generation
  • Streaming: Use generate_audio_stream() for long texts
  • Caching: Voice packs are cached after first load
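The CUDA-with-CPU-fallback behaviour described above can be sketched as follows. How Orator itself resolves `"device": "auto"` is not shown in this README, so `resolve_device` is an assumption about the general pattern rather than the project's code.

```python
def resolve_device(setting: str = "auto") -> str:
    """Map the config's "device" value to a concrete device string."""
    if setting != "auto":
        return setting  # explicit choice such as "cpu" or "cuda"
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    # Prefer CUDA when PyTorch reports a usable GPU, otherwise fall back to CPU.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(resolve_device("auto"))
```

Setting `"device": "cpu"` in `config_hotkey.json` bypasses the detection entirely.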

๐Ÿ“ Project Structure

Orator/
โ”œโ”€โ”€ kokoro/               # Core TTS library
โ”‚   โ”œโ”€โ”€ __init__.py
โ”‚   โ”œโ”€โ”€ model.py          # KModel implementation
โ”‚   โ”œโ”€โ”€ pipeline.py       # KPipeline
โ”‚   โ””โ”€โ”€ ...
โ”œโ”€โ”€ voices/               # Voice pack files (.pt)
โ”‚   โ”œโ”€โ”€ af_bella.pt
โ”‚   โ”œโ”€โ”€ am_adam.pt
โ”‚   โ””โ”€โ”€ ...
โ”œโ”€โ”€ macos_tts_hotkey.py   # macOS hotkey application
โ”œโ”€โ”€ kokoro-v1_0.pth       # Main TTS model
โ”œโ”€โ”€ requirements.txt      # Python dependencies
โ””โ”€โ”€ README.md             # This file

๐Ÿค Contributing

We welcome contributions! Please feel free to:

  • Report bugs and issues
  • Suggest new features
  • Submit pull requests
  • Add new voice packs
  • Improve documentation

🗺️ Roadmap

  • Streaming audio chunks for long-form text (controlled low latency)
  • Speed controls for the audio stream
  • LLM-driven agentic AI capabilities
  • Native macOS application/interface for UI-driven audio controls
  • UI voice swap controls

๐Ÿค Get Involved

Want to help shape the future of Kokoro TTS? Here's how:

  • ๐Ÿ› Report Issues - Help us identify bugs and improvements
  • ๐Ÿ’ก Suggest Features - Share your ideas for new functionality
  • ๐Ÿ”ง Contribute Code - Submit PRs for features or fixes
  • ๐ŸŽจ Design UI/UX - Help design the native app interface
  • ๐Ÿ“ Write Documentation - Improve guides and tutorials
  • ๐Ÿ—ฃ๏ธ Add Voices - Contribute new voice packs and languages

๐Ÿ™ Acknowledgments

  • Built on the amazing Kokoro TTS model
  • Powered by PyTorch and modern neural architectures
  • Inspired by the need for accessible, high-quality TTS

Made with ❤️ for the open-source community

About

Orator is an open-source TTS engine built for hotkey-driven text-to-speech on Apple macOS
