Local TTS Chrome Extension

A Chrome extension that reads web articles aloud through a TTS API running in a local Docker container, with high-quality text-to-speech in 60+ languages.

🎯 Project Overview

Local TTS is a privacy-focused Chrome extension that brings high-quality text-to-speech capabilities directly to your browser. Unlike cloud-based TTS services, this extension runs entirely on your local machine using Docker containers, ensuring your data never leaves your device.

✨ Key Features

🎵 High-quality TTS: Uses a local TTS model for natural-sounding speech
🌍 60+ Languages: Supports a wide range of languages and accents
📖 Smart Article Detection: Automatically extracts readable content from web pages
🎛️ Customizable Settings: Adjust voice, speed, and API endpoint
🎧 Audio Format Options: Choose between PCM (recommended), MP3, or WAV formats
⏯️ Playback Controls: Play, pause, and stop functionality
🎨 Modern UI: Beautiful gradient interface with visual indicators
🖱️ Context Menu: Right-click to read selected text or full articles
🔒 Privacy-First: All processing happens locally on your machine
🚀 Easy Setup: One-command Docker deployment

🛠️ Tech Stack

Frontend: Chrome Extension (Manifest V3), HTML5, CSS3, JavaScript (ES6+)
Backend: Local TTS API (Docker container)
Build Tools: esbuild, npm
Containerization: Docker, Docker Compose
Languages: JavaScript, HTML, CSS, YAML

🚀 Quick Start

Prerequisites

Docker (for running the TTS service)
Chrome Browser (Version 88+ for Manifest V3 support)
Git (for cloning the repository)

Installation

Clone the repository

git clone https://github.com/yourusername/local-tts.git
cd local-tts

Start the TTS service

# Pull the image you plan to run (only one of the two is needed)
docker pull ghcr.io/remsky/kokoro-fastapi-cpu:latest
docker pull ghcr.io/remsky/kokoro-fastapi-gpu:latest

# CPU version (recommended for most users)
docker run -d --name local-tts-cpu -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-cpu:latest

# GPU version (requires an NVIDIA GPU and the NVIDIA Container Toolkit).
# Run this instead of the CPU container: both bind host port 8880.
docker run -d --name local-tts-gpu --gpus all -p 8880:8880 ghcr.io/remsky/kokoro-fastapi-gpu:latest

# Or use the interactive setup script
./setup.sh

Load the extension in Chrome
- Open Chrome and go to chrome://extensions/
- Enable "Developer mode" (top right toggle)
- Click "Load unpacked"
- Select the local-tts folder
- The extension should now appear in your extensions list

Configure the extension
- Click the extension icon
- Set the API URL to: http://localhost:8880
- Choose your preferred voice and settings
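
To confirm that the API URL configured above is reachable, a quick check along these lines can help. This is a minimal sketch rather than the extension's own code: the `checkApi` helper is hypothetical, and the default `/v1/audio/voices` path is an assumption about the routes the container serves, so adjust it to an endpoint you have verified.

```javascript
// Hypothetical helper: resolves to true when the TTS service answers with a 2xx status.
// The default `path` is an assumption; change it to an endpoint your container serves.
async function checkApi(baseUrl, path = '/v1/audio/voices', fetchImpl = globalThis.fetch) {
  const url = baseUrl.replace(/\/+$/, '') + path; // strip trailing slashes to avoid "//"
  try {
    const response = await fetchImpl(url, { method: 'GET' });
    return response.ok;
  } catch (err) {
    // Network-level failures (container stopped, wrong port) end up here.
    return false;
  }
}

// Usage:
// checkApi('http://localhost:8880').then((ok) => console.log(ok ? 'reachable' : 'unreachable'));
```

The `fetchImpl` parameter exists only so the check can be exercised with a stub; in the browser or Node 18+ the global `fetch` is used.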

🛠️ Local Development

Development Setup

Clone and install dependencies

git clone https://github.com/yourusername/local-tts.git
cd local-tts
npm install

Start development environment

# Start the TTS service
make dev

# Or manually:
docker-compose -f docker-compose.dev.yml up -d

Build the extension

# Build once
npm run build

# Watch mode for development
npm run build:watch

Load in Chrome for testing
- Go to chrome://extensions/
- Enable "Developer mode"
- Click "Load unpacked"
- Select the project folder
- Click the reload button after making changes

Development Workflow

Make your changes in the src/ directory
Build the extension with npm run build or npm run build:watch
Reload the extension in Chrome's extension page
Test your changes on any webpage
Commit and push your changes

Project Structure

local-tts/
├── src/                    # Source code
│   ├── main.js            # Main content script
│   ├── audio-handler.js   # Audio processing
│   ├── text-extraction.js # Text extraction logic
│   ├── highlighting.js    # Text highlighting
│   └── ui-controller.js   # UI management
├── popup.html             # Extension popup UI
├── popup.js               # Popup functionality
├── background.js          # Background service worker
├── content.css            # Content script styles
├── manifest.json          # Extension manifest
├── docker-compose.yml     # Docker configuration
├── setup.sh               # Setup script
└── README.md              # This file

Available Scripts

# Build the extension
npm run build

# Watch mode for development
npm run build:watch

# Start development environment
make dev

# Run tests
make test

# Clean up
make clean

🤝 Contributing

We welcome contributions! Here's how you can help:

Ways to Contribute

🐛 Report bugs - Use GitHub Issues
💡 Suggest features - Open a feature request
🔧 Fix bugs - Submit a pull request
📚 Improve documentation - Help make the docs better
🌍 Add language support - Help with translations
🎨 UI/UX improvements - Enhance the user experience

Contribution Guidelines

Fork the repository

Create a feature branch

git checkout -b feature/your-feature-name

Make your changes
Test thoroughly
```
npm run build
make test
```

Commit with clear messages

git commit -m "feat: add new audio format support"

Push to your fork

git push origin feature/your-feature-name

Open a Pull Request

Code Style

Use clear, descriptive commit messages
Follow existing code style and formatting
Add comments for complex logic
Test your changes before submitting
Update documentation if needed

Testing Your Changes

Build the extension
```
npm run build
```

Load in Chrome
- Go to chrome://extensions/
- Click reload on the extension
- Test on various websites

Run API tests
```
make test
```

📋 Issue Templates

When reporting issues, please include:

Browser version and OS
Extension version
Steps to reproduce
Expected vs actual behavior
Screenshots (if applicable)
Console logs (if errors occur)

🏗️ Architecture

Extension Components

Content Script (src/main.js): Injected into web pages to extract text and handle user interactions
Popup (popup.html/js): Extension UI for settings and controls
Background Script (background.js): Service worker for background tasks
Audio Handler (src/audio-handler.js): Manages TTS API communication and audio playback
Text Extractor (src/text-extraction.js): Extracts readable content from web pages
UI Controller (src/ui-controller.js): Manages visual indicators and UI state
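
As an illustration of how the background service worker and the content script might cooperate for the context menu feature, here is a hedged sketch. The menu ids, titles, and message shapes (`READ_TEXT`, `READ_ARTICLE`) are assumptions made for this example and may not match `background.js`; the `chrome.contextMenus`, `chrome.runtime.onInstalled`, and `chrome.tabs.sendMessage` calls are standard extension APIs, and the `contextMenus` permission must be declared in `manifest.json` for them to work.

```javascript
// Hypothetical wiring for a background service worker.
// Menu ids, titles, and message types are illustrative assumptions.
function registerContextMenus(chromeApi) {
  // Create the menu entries once, when the extension is installed or updated.
  chromeApi.runtime.onInstalled.addListener(() => {
    chromeApi.contextMenus.create({
      id: 'read-selection',
      title: 'Read selected text',
      contexts: ['selection'],
    });
    chromeApi.contextMenus.create({
      id: 'read-article',
      title: 'Read full article',
      contexts: ['page'],
    });
  });

  // Forward each click to the content script in the tab where it happened.
  chromeApi.contextMenus.onClicked.addListener((info, tab) => {
    if (!tab || tab.id === undefined) return;
    const message =
      info.menuItemId === 'read-selection'
        ? { type: 'READ_TEXT', text: info.selectionText || '' }
        : { type: 'READ_ARTICLE' };
    chromeApi.tabs.sendMessage(tab.id, message);
  });
}

// In the service worker itself: registerContextMenus(chrome);
```

Passing the `chrome` object in as a parameter keeps the function testable outside the browser with a small stub.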

Data Flow

The user clicks "Read" in the popup or the context menu
The content script extracts text from the page
The audio handler sends the text to the local TTS API
Audio is streamed back and played through the Web Audio API
The UI updates to show progress and the current text position
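
The request and playback steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in `src/audio-handler.js`: the `/v1/audio/speech` path, the request body fields, the `kokoro` model name, and the 24000 Hz mono sample rate are all assumptions about the local service, and the sketch waits for the full response instead of streaming it, to keep the example short.

```javascript
// Convert little-endian signed 16-bit PCM bytes into the floats in [-1, 1) that Web Audio expects.
function pcm16ToFloat32(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  const samples = new Float32Array(Math.floor(arrayBuffer.byteLength / 2));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}

// Hypothetical request helper. The endpoint path and body fields are assumptions
// (OpenAI-style speech route); check them against your running container.
async function synthesize(apiUrl, text, options = {}, fetchImpl = globalThis.fetch) {
  const { voice, speed = 1.0, model = 'kokoro' } = options;
  const response = await fetchImpl(apiUrl.replace(/\/+$/, '') + '/v1/audio/speech', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, input: text, voice, speed, response_format: 'pcm' }),
  });
  if (!response.ok) throw new Error(`TTS request failed: ${response.status}`);
  return pcm16ToFloat32(await response.arrayBuffer());
}

// Browser-only playback: wrap the samples in an AudioBuffer and start a source node.
// The 24000 Hz default is an assumption; match it to what the service emits.
function playSamples(audioContext, samples, sampleRate = 24000) {
  const buffer = audioContext.createBuffer(1, samples.length, sampleRate);
  buffer.copyToChannel(samples, 0);
  const source = audioContext.createBufferSource();
  source.buffer = buffer;
  source.connect(audioContext.destination);
  source.start();
  return source;
}
```

In a page, the pieces would combine roughly as `playSamples(new AudioContext(), await synthesize(apiUrl, text, { voice }))`, called from a user gesture.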

🔧 Configuration

Environment Variables

The TTS service can be configured with these environment variables:

environment:
  - HOST=0.0.0.0
  - PORT=8880
  - WORKERS=1
  - MODEL_PATH=/app/models
  - CACHE_DIR=/app/cache
  - LOG_LEVEL=INFO
  - VOICE_CACHE_SIZE=100
  - AUDIO_CACHE_SIZE=100
  - MAX_TEXT_LENGTH=5000
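
Because `MAX_TEXT_LENGTH` caps how much text a single request may carry, long articles need to be split on the client before they are sent. The following is a minimal sketch of one way to do that, assuming the limit counts characters; the `chunkText` helper and its sentence heuristic are illustrative and not taken from the extension's source.

```javascript
// Split text into chunks no longer than maxLength, preferring sentence boundaries.
function chunkText(text, maxLength = 5000) {
  // Rough heuristic: a sentence is a run of text ending in ., !, or ? plus trailing whitespace.
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) || [];
  const chunks = [];
  let current = '';
  for (const sentence of sentences) {
    // Flush the running chunk if adding this sentence would exceed the limit.
    if (current && current.length + sentence.length > maxLength) {
      chunks.push(current.trim());
      current = '';
    }
    if (sentence.length > maxLength) {
      // A single oversized sentence is hard-split at the limit.
      for (let i = 0; i < sentence.length; i += maxLength) {
        const piece = sentence.slice(i, i + maxLength).trim();
        if (piece) chunks.push(piece);
      }
    } else {
      current += sentence;
    }
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// Usage: chunkText(articleText, 5000).forEach((chunk) => { /* send chunk to the API */ });
```

Splitting at sentence boundaries keeps each chunk speakable on its own, which tends to sound more natural than cutting at an arbitrary character offset.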

Extension Settings

API URL: Your local TTS service endpoint
Voice: Choose from 60+ available voices
Speed: Adjust playback speed (0.5x to 2.0x)
Audio Format: PCM, MP3, or WAV
Chunk Size: Text chunk size for processing
Auto-play: Automatically play next chunk
Highlighting: Highlight current text being read
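
To make the ranges above concrete, here is a hedged sketch of how stored settings could be validated before use. The key names and default values (including the 1000-character chunk size and the empty voice) are assumptions for illustration; only the 0.5x to 2.0x speed range, the three audio formats, and the `http://localhost:8880` URL come from this README.

```javascript
// Hypothetical defaults; the real extension may use different keys and values.
const DEFAULT_SETTINGS = {
  apiUrl: 'http://localhost:8880',
  voice: '',          // left empty: valid voice ids depend on the service
  speed: 1.0,
  audioFormat: 'pcm',
  chunkSize: 1000,    // assumed default, in characters
  autoPlay: true,
  highlighting: true,
};
const AUDIO_FORMATS = ['pcm', 'mp3', 'wav'];

// Merge user input over the defaults and coerce each field into its allowed range.
function normalizeSettings(input = {}) {
  const s = { ...DEFAULT_SETTINGS, ...input };
  const speed = Number(s.speed);
  s.speed = Number.isFinite(speed) ? Math.min(2.0, Math.max(0.5, speed)) : DEFAULT_SETTINGS.speed;
  const format = String(s.audioFormat).toLowerCase();
  s.audioFormat = AUDIO_FORMATS.includes(format) ? format : DEFAULT_SETTINGS.audioFormat;
  const chunkSize = Math.floor(Number(s.chunkSize));
  s.chunkSize = chunkSize > 0 ? chunkSize : DEFAULT_SETTINGS.chunkSize;
  s.apiUrl = String(s.apiUrl).trim().replace(/\/+$/, '');
  s.autoPlay = Boolean(s.autoPlay);
  s.highlighting = Boolean(s.highlighting);
  return s;
}

// In the extension, the input would typically come from chrome.storage, for example:
// const settings = normalizeSettings(await chrome.storage.sync.get(DEFAULT_SETTINGS));
```

Normalizing on read means a malformed or outdated stored value falls back to a safe default instead of producing a failed API request.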

🐛 Troubleshooting

Common Issues

"Cannot connect to API"
- Ensure the Docker container is running (check with docker ps)
- Check that port 8880 is accessible
- Verify the API URL in the extension settings

"No readable content found"
- Try selecting specific text
- Check that the page has detectable content
- Some JavaScript-heavy sites may not work

Audio not playing
- Check browser audio settings
- Make sure you have interacted with the page (a click or key press); Chrome's autoplay policy blocks audio until a user gesture occurs
- Try refreshing the page
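
For the "Audio not playing" case, the usual cause is an `AudioContext` that is still suspended because no user gesture has happened yet. A minimal sketch of the check, assuming playback goes through the Web Audio API as described in this README (the `ensureAudioRunning` helper name is hypothetical):

```javascript
// Resume a suspended AudioContext; browsers keep it suspended until a user gesture.
async function ensureAudioRunning(audioContext) {
  if (audioContext.state === 'suspended') {
    await audioContext.resume();
  }
  return audioContext.state === 'running';
}

// Browser usage (call from a click handler so the gesture requirement is met):
// button.addEventListener('click', async () => {
//   const ctx = new AudioContext();
//   if (!(await ensureAudioRunning(ctx))) console.warn('Audio is still blocked');
// });
```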

Debug Mode

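Enable debug logging by setting LOG_LEVEL=DEBUG when starting the container. Stop and remove any existing local-tts-cpu container first, since the command below reuses the same name and port: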

docker run -d --name local-tts-cpu -p 8880:8880 \
  -e LOG_LEVEL=DEBUG \
  ghcr.io/remsky/kokoro-fastapi-cpu:latest

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Built with Local TTS API
Chrome Extension Manifest V3
Web Audio API for audio processing
Docker for containerization

📞 Support

📧 Email: [your-email@example.com]
🐛 Issues: GitHub Issues
📖 Documentation: Wiki
💬 Discussions: GitHub Discussions

🚀 Roadmap

Made with ❤️ by the Local TTS community

For security issues, see SECURITY.md. For contribution guidelines, see CONTRIBUTING.md. For community standards, see CODE_OF_CONDUCT.md.

jeeveshlodhi/local_tts
