ASMR demo: video-1737110239209.webm (typo in video, ignore it)
Digital Human demo: output2_added_subtitle.mp4
Give a star ⭐ if you like it!
Kokoro is a trending, top-2 TTS model on Hugging Face.
This repo provides insanely fast Kokoro inference in Rust. You can now build your own TTS engine powered by Kokoro and run fast inference with a single koko command.
kokoros is a Rust crate that provides easy-to-use TTS capabilities.
You can call koko directly in the terminal to synthesize audio.
kokoros uses a relatively small model (87M parameters), yet it produces extremely high-quality voices.
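Beyond the koko binary, the crate can in principle be used as a library from your own Rust code. The sketch below is only a rough illustration under assumptions: the type name TTSKoko, the synth method, and the file paths are hypothetical placeholders, not the crate's confirmed API; check the crate source for the real signatures.

```rust
// Hypothetical sketch of library usage; `TTSKoko`, `synth`, and the paths
// below are placeholders, not the confirmed kokoros API.
use kokoros::TTSKoko; // hypothetical import path

fn main() {
    // Load the ONNX model and the voices.json fetched by scripts/fetch_voices.py.
    let tts = TTSKoko::new("checkpoints/kokoro.onnx", "data/voices.json");

    // Synthesize one sentence with a given voice style and write it as WAV.
    tts.synth("Hello, this is a TTS test", "af_sky", "tmp/output.wav");
}
```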
Language support:
- English;
- Chinese (partly);
- Japanese (partly);
- German (partly);
🔥🔥🔥🔥🔥🔥🔥🔥🔥 The Kokoros Rust version is getting a lot of attention right now. If you are also interested in insanely fast inference, embedded builds, WASM support, etc., please star this repo! We keep updating it.
New Discord community: https://discord.gg/E566zfDWqD. Please join us if you are interested in Rust Kokoro.
- 2025.01.22: 🔥🔥🔥 Streaming mode supported. You can now use --stream to have fun with stream mode, kudos to mroigo;
- 2025.01.17: 🔥🔥🔥 Style mixing supported! Now, listen to the output ASMR effect by simply specifying the style af_sky.4+af_nicole.5;
- 2025.01.15: OpenAI-compatible server supported, the OpenAI format is still being polished!
- 2025.01.15: Phonemizer supported! Now Kokoros can run E2E inference without any other dependencies! Kudos to @tstm;
- 2025.01.13: Espeak-ng tokenizer and phonemizer supported! Kudos to @mindreframer;
- 2025.01.12: Released Kokoros;
- Install required Python packages:
pip install -r scripts/requirements.txt

- Initialize voice data:
python scripts/fetch_voices.py

This step fetches the required voices.json data file, which is necessary for voice synthesis.
- Build the project:
cargo build --release
./target/release/koko -h
./target/release/koko text "Hello, this is a TTS test"
The generated audio will be saved to tmp/output.wav by default. You can customize the save location with the --output or -o option:
./target/release/koko text "I hope you're having a great day today!" --output greeting.wav
./target/release/koko file poem.txt
For a file with 3 lines of text, the speech audio files tmp/output_0.wav, tmp/output_1.wav, and tmp/output_2.wav will be written by default. You can customize the save location with the --output or -o option, using {line} as the line number:
./target/release/koko file lyrics.txt -o "song/lyric_{line}.wav"
- Start the server:
./target/release/koko openai

- Make API requests using either curl or Python:
Using curl:
curl -X POST http://localhost:3000/v1/audio/speech \
-H "Content-Type: application/json" \
-d '{
"model": "anything can go here",
"input": "Hello, this is a test of the Kokoro TTS system!",
"voice": "af_sky"
}' \
--output sky-says-hello.wav

Using Python:
python scripts/run_openai.py
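If you prefer calling the endpoint from Rust instead of the bundled script, here is a minimal client sketch. It assumes the reqwest (with the blocking and json features) and serde_json crates as extra dependencies, and that the server is already running on localhost:3000 as started above.

```rust
use std::fs;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Same payload as the curl example above.
    let body = serde_json::json!({
        "model": "anything can go here",
        "input": "Hello, this is a test of the Kokoro TTS system!",
        "voice": "af_sky"
    });

    // POST to the OpenAI-compatible speech endpoint and keep the raw bytes.
    let resp = reqwest::blocking::Client::new()
        .post("http://localhost:3000/v1/audio/speech")
        .json(&body)
        .send()?
        .error_for_status()?;

    // Save the returned audio as a WAV file.
    fs::write("sky-says-hello.wav", resp.bytes()?)?;
    Ok(())
}
```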
The stream option will start the program, reading lines of input from stdin and writing WAV audio to stdout. Use it in conjunction with piping.
./target/release/koko stream > live-audio.wav
# Start typing some text to generate speech for and hit enter to submit
# Speech will append to `live-audio.wav` as it is generated
# Hit Ctrl+D to exit
echo "Suppose some other program was outputting lines of text" | ./target/release/koko stream > programmatic-audio.wav
- Build the image:

docker build -t kokoros .

- Run the image, passing options as described above:
# Basic text to speech
docker run -v ./tmp:/app/tmp kokoros text "Hello from docker!" -o tmp/hello.wav
# An OpenAI server (with appropriately bound port)
docker run -p 3000:3000 kokoros openai

Since Kokoro's capabilities are not yet finalized, this repo will keep tracking the status of Kokoro, and hopefully we can have language support including English, Mandarin, Japanese, German, French, etc.
Copyright reserved by Lucas Jin under the Apache License.
