Add Smart UI Auto-Detection, Dynamic Voice System, and .env Support by kiralpoon · Pull Request #37 · NVIDIA/personaplex

kiralpoon · 2026-01-27T09:31:43Z

This PR adds three quality-of-life improvements that simplify PersonaPlex setup and development workflow while maintaining full backward compatibility.

Features

1. Smart UI Auto-Detection

Problem: Developers had to pass --static client/dist manually to serve a custom UI build.

Solution: The server now detects client/dist automatically and serves the custom UI from it when present; otherwise it falls back to the default HuggingFace UI.

Benefits:

Zero configuration - just build and run
Automatic fallback to default UI
Clear log messages showing which UI is being served
Optional manual override with --static flag

Logs example:

Found custom UI at .../client/dist, using it instead of default
static_path = /home/.../personaplex-blackwell/client/dist
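The detection order described above can be sketched roughly as follows. This is an illustration only, not the code in server.py: the function name resolve_static_path, the repo_root parameter, and the use of index.html as the marker of a finished build are assumptions made for this sketch.

```python
from pathlib import Path
from typing import Optional


def resolve_static_path(static_arg: Optional[str], repo_root: Path) -> Optional[Path]:
    """Pick the directory to serve the web UI from.

    Returns None when the caller should fall back to the default UI.
    """
    # 1. An explicit --static value always wins (manual override).
    if static_arg is not None:
        return Path(static_arg)

    # 2. Otherwise look for a local build. Checking for index.html is an
    #    assumption of this sketch, used to skip an empty dist directory.
    candidate = repo_root / "client" / "dist"
    if (candidate / "index.html").is_file():
        print(f"Found custom UI at {candidate}, using it instead of default")
        return candidate

    # 3. No local build: signal that the default UI should be used.
    return None
```

Returning None for the fallback case keeps the decision separate from however the default UI is actually fetched, which this sketch does not model.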

2. Dynamic Custom Voice System

Problem: Adding a custom voice required code changes before it became available in the Web UI.

Solution: An automatic voice discovery system scans the voice directories and exposes every voice it finds via the /api/voices endpoint. A server restart is still needed to pick up new files.

Benefits:

Drop voice files and restart - no code changes needed
Supports both HuggingFace cache and custom directories
CUSTOM_VOICE_DIR environment variable for custom locations
REST API for programmatic voice listing
Frontend automatically populates dropdown from API

Voice file support:

.pt files (voice embeddings) - appear in UI dropdown
.wav files (source audio) - used for generating embeddings

API endpoint:

curl http://localhost:8998/api/voices

Returns categorized voice list with metadata.
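A minimal sketch of the discovery step, assuming a flat directory layout. The function names, the dictionary keys, and the "custom_voices" default for CUSTOM_VOICE_DIR are hypothetical; the actual VoiceDiscovery class and the JSON shape returned by /api/voices may differ.

```python
import os
from pathlib import Path
from typing import Dict, List, Mapping


def custom_voice_dir(environ: Mapping[str, str] = os.environ) -> Path:
    """Resolve the custom voice directory (the default value is an assumption)."""
    return Path(environ.get("CUSTOM_VOICE_DIR", "custom_voices"))


def discover_voices(default_dir: Path, custom_dir: Path) -> Dict[str, List[dict]]:
    """Scan both directories for .pt embeddings and group them by origin."""

    def scan(directory: Path, category: str) -> List[dict]:
        if not directory.is_dir():
            return []  # a missing directory simply contributes no voices
        return [
            {"name": path.stem, "file": path.name, "category": category}
            for path in sorted(directory.glob("*.pt"))  # .wav sources are skipped
        ]

    # Custom voices come first, mirroring the dropdown order the PR describes.
    return {
        "custom": scan(custom_dir, "custom"),
        "default": scan(default_dir, "default"),
    }
```

Only .pt files are listed because, per the PR, .wav files are source audio for generating embeddings and are not selectable voices.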

3. .env File Support

Problem: Users must run export HF_TOKEN=... in every terminal session or configure huggingface-cli login globally.

Solution: The server and offline scripts load a .env file automatically via the python-dotenv library.

Why .env is Better:

Feature	export	.env
Persistence	Lost when the terminal closes	Set once per checkout
Loading	Manual before each run	Automatic at startup
Security	Token typed at the prompt lands in shell history	Token stays out of shell history
Project-specific	Scoped to the shell session	Scoped to the project
Documentation	No standard way	.env.example template
Multiple vars	Export each separately	All in one file

Example comparison:

# Old way (export):
export HF_TOKEN=hf_xxxxx
export CUSTOM_VOICE_DIR=./my_voices
SSL_DIR=$(mktemp -d); python -m moshi.server --ssl "$SSL_DIR"
# The exports must be repeated in every new terminal session

# New way (.env):
cp .env.example .env
# Edit .env once, then just:
SSL_DIR=$(mktemp -d); python -m moshi.server --ssl "$SSL_DIR"
# .env is loaded automatically on every run
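The PR relies on python-dotenv for loading. To show what that amounts to without the dependency, here is a deliberately simplified stdlib-only sketch; load_env_file is a hypothetical helper, not part of the PR, and it handles only plain KEY=VALUE lines. Like python-dotenv's default behavior, it leaves variables that are already set in the environment untouched, so export keeps working and takes precedence.

```python
import os
from pathlib import Path
from typing import MutableMapping


def load_env_file(path: Path, environ: MutableMapping[str, str] = os.environ) -> bool:
    """Copy KEY=VALUE pairs from `path` into `environ`.

    Returns False when the file does not exist, since .env is optional.
    """
    if not path.is_file():
        return False
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # naive quote handling (simplification)
        # Values already exported in the shell win over the file.
        environ.setdefault(key, value)
    return True
```

A .env file containing HF_TOKEN and CUSTOM_VOICE_DIR lines would populate both variables unless the shell had already exported them.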

Changes

Backend:

moshi/moshi/server.py: UI auto-detection logic, voice discovery, .env loading
moshi/moshi/offline.py: .env loading, save-voice-embeddings support
moshi/moshi/voice_discovery.py: New VoiceDiscovery class for scanning voice files
moshi/pyproject.toml: Added python-dotenv dependency

Frontend:

client/src/hooks/useVoices.ts: New hook for fetching voices from API
client/src/components/ModelParams/ModelParams.tsx: Dynamic voice dropdown
client/src/pages/Queue/Queue.tsx: Integrated voice selection

Documentation:

.env.example: Template with HF_TOKEN and CUSTOM_VOICE_DIR
README.md: Updated with all three features
QUICKSTART.md: Fast setup guide for new users
FRONTEND_DEVELOPMENT.md: Frontend development workflow guide
TROUBLESHOOTING.md: Common issues and solutions
custom_voices/README.md: Custom voice creation guide

Backward Compatibility

✅ Fully backward compatible - no breaking changes:

All existing workflows continue to work
--static flag still works for manual override
export HF_TOKEN still works
huggingface-cli login still works
.env file is optional
Hardcoded voice prompts still work

Testing

UI Auto-Detection:

✅ With client/dist present - serves custom UI
✅ Without client/dist - falls back to HuggingFace UI
✅ Manual --static override works
✅ Log messages correctly indicate which UI is served

Voice System:

✅ Pre-packaged voices (NATF, NATM, VARF, VARM) load correctly
✅ Custom .pt voice files appear in dropdown
✅ CUSTOM_VOICE_DIR environment variable works
✅ /api/voices endpoint returns correct JSON
✅ Frontend dropdown populates automatically
✅ Voice selection persists across sessions

.env Support:

✅ Server starts successfully with .env file
✅ Server starts successfully without .env file (fallback)
✅ HF_TOKEN loads from .env
✅ CUSTOM_VOICE_DIR loads from .env
✅ No warnings when .env is missing
✅ No breaking changes to existing workflows

Statistics

16 files changed: 1,323 insertions, 47 deletions
New files: 6 (voice_discovery.py, useVoices.ts, 4 documentation files)
Modified files: 10
New dependencies: 1 (python-dotenv)
Breaking changes: 0

Example Usage

UI Auto-Detection

# Build frontend
cd client && npm run build && cd ..

# Server auto-detects - no flag needed!
SSL_DIR=$(mktemp -d); python -m moshi.server --ssl "$SSL_DIR"
# Logs: "Found custom UI at .../client/dist, using it instead of default"

Custom Voices

# Add voice file
cp my_voice.wav custom_voices/

# Generate embeddings
python -m moshi.offline --voice-prompt "my_voice.wav" \
  --save-voice-embeddings --input-wav "assets/test/input_assistant.wav" --output-wav "/tmp/out.wav"

# Restart server - voice appears in UI automatically!

.env Setup

# One-time setup
cp .env.example .env
# Edit .env and add: HF_TOKEN=your_token_here

# All commands now work without export
SSL_DIR=$(mktemp -d); python -m moshi.server --ssl "$SSL_DIR"

Related Issues

This PR addresses user experience friction mentioned in community discussions about:

Simplifying frontend development workflow
Making custom voices easier to add
Reducing setup friction for new users

All features have been tested and verified working. Ready for review and merge. 🚀

- Add --save-voice-embeddings CLI flag to offline.py for generating custom voice prompt embeddings from WAV files
- Remove torch < 2.5 upper bound to allow PyTorch 2.10+ for RTX 5090
- Add missing pyloudnorm dependency required for audio normalization
- Update README with conda setup instructions, Blackwell GPU guide, and custom voice creation tutorial
- Update .gitignore for Claude Code local settings

- Add python-dotenv dependency to pyproject.toml
- Load environment variables from .env file in server.py and offline.py
- Add warning when .env exists but HF_TOKEN is not set
- Create .env.example template for users
- Update README.md with .env configuration instructions

The .env file is optional and all existing workflows continue to work. Users can now configure HF_TOKEN via .env file, environment variable, or huggingface-cli login.
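The warning rule in this commit (warn when .env exists but HF_TOKEN is unset, stay silent when .env is absent) could look roughly like the sketch below. The function name and the message wording are hypothetical, not taken from the PR's code.

```python
from pathlib import Path
from typing import Mapping, Optional


def hf_token_warning(env_file: Path, environ: Mapping[str, str]) -> Optional[str]:
    """Return a warning message, or None when nothing needs to be reported."""
    if environ.get("HF_TOKEN"):
        return None  # token available, whether from .env or from export
    if env_file.is_file():
        # The user opted into .env but left the token out: worth flagging.
        return f"{env_file} exists but HF_TOKEN is not set (message text is illustrative)"
    # No .env at all: stay silent, the user may rely on huggingface-cli login.
    return None
```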

Implement dynamic voice discovery system that allows users to add custom voices without modifying code. Voices automatically appear in the Web UI dropdown after generating embeddings and restarting the server.

Backend changes:
- Add VoiceDiscovery service to scan configured voice directories
- Add /api/voices REST endpoint returning voice list with metadata
- Support custom voices directory (configurable via CUSTOM_VOICE_DIR)
- Only list .pt embedding files (not .wav source audio)

Frontend changes:
- Add useVoices React hook for dynamic voice fetching
- Update Queue and ModelParams components to use dynamic voice loading
- Add loading and error states for better UX
- Custom voices appear first in dropdown

Infrastructure:
- Add custom_voices/ directory with comprehensive README
- Update .gitignore to exclude voice files but keep directory structure
- Add TROUBLESHOOTING.md documenting common issues
- Update README.md with installation, server, and custom voice docs

Key fixes applied during implementation:
- Package must be installed in editable mode (pip install -e .) for dev
- Server needs --static client/dist flag to serve local frontend builds
- API routes must be registered before static routes in aiohttp
- Critical: Only .pt files are selectable voices (not .wav source files)

Server now automatically detects and serves custom UI from client/dist without requiring --static flag, simplifying development workflow. Falls back to HuggingFace default UI when custom build is unavailable. Adds comprehensive documentation including QUICKSTART.md and FRONTEND_DEVELOPMENT.md guides.

- Add .env.example template with HF_TOKEN and CUSTOM_VOICE_DIR
- Update documentation to recommend .env file as primary method
- .env files are automatically loaded by server and offline scripts
- Maintains backward compatibility with export and huggingface-cli methods

Benefits of .env over export will be explained in PR description.

Resolved conflicts in README.md and moshi/moshi/offline.py:
- README.md: Preserved our three-option auth setup (.env, export, CLI) and auto-detection documentation
- offline.py: Kept save_embeddings feature (our addition) vs upstream's hardcoded False

Our additions preserved:
- .env file support for easier token management
- UI auto-detection with detailed documentation
- Custom voice system with dynamic loading
- save-voice-embeddings feature in offline mode
- QUICKSTART.md and FRONTEND_DEVELOPMENT.md guides

Successfully merged upstream/main with our feature branch containing:
- UI auto-detection system
- Dynamic custom voice loading
- .env file support for token management

All conflicts resolved, tests passed, ready for upstream PR.

rajarshiroy-nvidia and others added 30 commits January 15, 2026 10:52

Update README.md

617b2f4

Update README.md

1def0c6

add containerized support for personaplex

1535caa

Merge pull request NVIDIA#1 from tuttlebr/btuttle/container-build

261cbe6

add containerized support for personaplex

Temporary fix for HF downloads tracking.

62ae4f7

Update README.md

e4e8f4c

Install build dependencies and libopus-dev to fix Docker build

0947e85

Removed unused dependencies

b0b78e4

Update README.md to update permanent discord link and installation fix for Blackwell GPUs.

828e15f

Merge pull request NVIDIA#7 from wachawo/main

195768c

Install build dependencies and libopus-dev to fix Docker build

Add --cpu-offload flag for both server and offline modes, which offloads model layers to CPU when GPU memory is insufficient.

aaa0692

docs: add mention of opus dependency

ce22eeb

Update README.md

0eda307

Update README.md

f97974b

Merge pull request NVIDIA#15 from jaycoolslm/docs/add-opus-prereq

20e9dc4

Add libopus-dev to installation prerequisites

Merge pull request NVIDIA#14 from grafael/cpu_offload

2e6b4e4

Add low VRAM feature

Update README.md

c9f6d0a

fix: reduce memory need during model init

0c3dd8b

Merge origin/main: integrate cpu-offload with custom voice support

7f364f0

Merge pull request NVIDIA#18 from tsdocode/fix/init-oom

49e3d0a

fix: reduce memory need during model init

Merge upstream/main: reduce memory need during model init

ca452ae

Remove .env documentation from PR

a2d59cb

- Remove .env file configuration instructions
- Update to use export HF_TOKEN instead
- Change CUSTOM_VOICE_DIR docs to use export
- Keep only environment variable and huggingface-cli login methods

Add Example Usage section to README

d68aff1

- Add practical examples for UI auto-detection
- Add example for custom voice workflow
- Show expected log output for verification

Merge pr-ui-and-voice-features into main

86174a9

Update main branch with clean PR version:
- Smart UI auto-detection feature
- Dynamic custom voice system
- All .env references removed
- Complete documentation added

kiralpoon added 4 commits January 27, 2026 17:59

Add .gitignore rules for Claude Code tooling files

e38d377

- Ignore .agent/ directory
- Ignore Agents.md
- Ignore Claude.local.md

These are personal tooling files that should never be in the repository.

kiralpoon wants to merge 34 commits into NVIDIA:main from kiralpoon:pr-all-features

6 participants
