This backend system consists of several components that enable real-time conversational characters using Pixel Streaming, STT, TTS, LLM integration, and lipsync. Communication between Unreal Engine (UE) and the backend is handled via WebSockets.
- When a player speaks to a UE character, the audio is captured as a WAV file.
- The backend Node.js controller (communicator) receives a notification from UE and forwards the audio to the speech-to-text (STT) service.
- The STT service converts the stereo signal to mono, resamples to 16 kHz, and runs it through a speech recognition model.
- The recognized text is sent to the LLM service, which generates a response.
- Once the LLM response is ready, the Node.js controller notifies the text-to-speech (TTS) and lipsync services via HTTP:
  - http://127.0.0.1:5001/speak → TTS service (generates audio).
  - http://127.0.0.1:5002/process_wav → Lipsync service (processes audio).
- The TTS service outputs a WAV file, which is then passed to the lipsync service.
- The lipsync service generates blendshapes, streams them to UE via Live Link (WebSockets), and plays the audio.
- The UE character consumes the blendshapes and applies them in real time for synchronized facial animation.
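To make the flow concrete, here is a minimal sketch (TypeScript, Node.js 18+) of how the controller might chain these calls. Only the TTS (`/speak`) and lipsync (`/process_wav`) endpoints come from the list above; the STT and LLM URLs, the payload shapes, and the assumption that UE sends the captured WAV bytes directly over the WebSocket are illustrative placeholders, not the project's actual code.

```typescript
// controller-sketch.ts — hypothetical outline of the Node.js communicator.
// Only the TTS and lipsync endpoints are from the README; everything else is assumed.
import { WebSocketServer, type RawData } from "ws";

const TTS_URL = "http://127.0.0.1:5001/speak";            // from the README
const LIPSYNC_URL = "http://127.0.0.1:5002/process_wav";  // from the README
const STT_URL = "http://127.0.0.1:5003/transcribe";       // assumed
const LLM_URL = "http://127.0.0.1:5004/generate";         // assumed

const wss = new WebSocketServer({ port: 8080 });           // assumed UE-facing port

wss.on("connection", (ue) => {
  ue.on("message", async (data: RawData) => {
    try {
      // ws can deliver Buffer, ArrayBuffer, or Buffer[]; normalize to a single Buffer.
      const audio = Buffer.isBuffer(data)
        ? data
        : Array.isArray(data)
          ? Buffer.concat(data)
          : Buffer.from(data);

      // 1. Forward the captured WAV to the STT service.
      const sttRes = await fetch(STT_URL, {
        method: "POST",
        headers: { "Content-Type": "audio/wav" },
        body: audio,
      });
      const { text } = (await sttRes.json()) as { text: string };

      // 2. Send the recognized text to the LLM service.
      const llmRes = await fetch(LLM_URL, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ prompt: text }),
      });
      const { reply } = (await llmRes.json()) as { reply: string };

      // 3. Ask the TTS service to synthesize the reply (assumed to return WAV bytes).
      const ttsRes = await fetch(TTS_URL, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ text: reply }),
      });
      const wav = Buffer.from(await ttsRes.arrayBuffer());

      // 4. Hand the WAV to the lipsync service, which generates blendshapes,
      //    streams them to UE via Live Link, and plays the audio.
      await fetch(LIPSYNC_URL, {
        method: "POST",
        headers: { "Content-Type": "audio/wav" },
        body: wav,
      });
    } catch (err) {
      console.error("pipeline failed:", err);
    }
  });
});
```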
- Instead of writing and reading WAV files between services, use binary streaming where possible (WebSockets or gRPC) for lower latency.
- Each LLM-dependent service should run as a standalone server connected to the Node.js controller via WebSockets.
- Fault tolerance: the Node.js controller should implement fallback and reconnection logic to ensure resilience (e.g., if one service crashes, reconnect with retries and fail over to a backup); see the sketch after this list.
- Currently, audio playback happens inside the lipsync service, which only works locally. In production, audio should be sent back to UE for playback.
- For smoother synchronization, blendshapes should only be streamed when UE has confirmed it received the audio and is ready to play it.
- Low latency must remain the top priority when introducing any improvements.
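As an illustration of the fault-tolerance point above, here is a minimal reconnection helper with capped exponential backoff. The service URL, backoff values, and message handling are placeholders, not the project's actual code; failover to a backup endpoint would sit on top of the same pattern.

```typescript
// resilient-socket.ts — illustrative sketch of the reconnection behaviour described above.
// The URL, retry limits, and backoff values are placeholders.
import WebSocket, { type RawData } from "ws";

function connectWithRetry(
  url: string,
  onMessage: (data: RawData) => void,
  attempt = 0,
): void {
  const ws = new WebSocket(url);

  ws.on("open", () => {
    console.log(`connected to ${url}`);
    attempt = 0; // connection is healthy again, reset the backoff
  });

  ws.on("message", onMessage);

  ws.on("error", (err) => {
    // "close" always follows an error, so the retry below will be scheduled there.
    console.error(`socket error on ${url}: ${err.message}`);
  });

  ws.on("close", () => {
    // Exponential backoff, capped at 10 s.
    const delayMs = Math.min(1000 * 2 ** attempt, 10_000);
    console.warn(`connection to ${url} lost, retrying in ${delayMs} ms`);
    setTimeout(() => connectWithRetry(url, onMessage, attempt + 1), delayMs);
  });
}

// Example: keep the link to a hypothetical STT service alive.
connectWithRetry("ws://127.0.0.1:5003", (data) => {
  console.log("STT message:", data.toString());
});
```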
Not all dependencies may be listed here, but here are the ones you definitely need to run the system locally:
- Run `npm install`.
- Install SoX to convert stereo to mono in real time (see the sketch after these steps): https://sourceforge.net/projects/sox/
- Install the required dependencies for the Python services in `\services\neuro-sync\Local_API` and `\services\neuro-sync\Player`. The required models are downloaded the first time you start the project.
- Edit the absolute paths in `start.bat`. This is the entry point that opens all required services in a Windows Terminal grid.
If all dependencies are installed and the paths are correct, you can run `npm run dev` to start the project.
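For the SoX step above, here is one way it could be driven from Node as a child process to produce the mono, 16 kHz audio the STT service expects. This is a sketch under assumptions: the file names are placeholders, and the real pipeline may call SoX from the Python STT service or stream audio instead of using files on disk.

```typescript
// stereo-to-mono.ts — illustrative only: one way to call SoX for the
// stereo → mono, 16 kHz conversion described above. File names are placeholders.
import { spawn } from "node:child_process";

function convertToMono16k(input: string, output: string): Promise<void> {
  return new Promise((resolve, reject) => {
    // sox <in> -r 16000 -c 1 <out>  → resample to 16 kHz and downmix to one channel
    const sox = spawn("sox", [input, "-r", "16000", "-c", "1", output]);
    sox.on("error", reject); // e.g. SoX is not installed or not on PATH
    sox.on("close", (code) =>
      code === 0 ? resolve() : reject(new Error(`sox exited with code ${code}`)),
    );
  });
}

// Usage (hypothetical file names):
convertToMono16k("mic_capture.wav", "mic_capture_16k_mono.wav")
  .then(() => console.log("conversion done"))
  .catch((err) => console.error(err));
```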
neuro-sync is a third-party library I used to handle lipsync blendshape generation and TTS. I built a `server.py` wrapper around it to expose its functionality as a service:
- https://github.com/AnimaVR/NeuroSync_Player
- https://github.com/AnimaVR/NeuroSync_Local_API
This project is released under a dual-license model:
- Free / MIT License
  - Free for individuals and organizations earning under $1M/year.
  - Covers all original code authored in this project.
  - See the full MIT license in `LICENSE`.
- Commercial License
  - Required for organizations with annual revenue of $1,000,000 or more.
  - Provides extended rights, priority support, and permission to integrate into proprietary systems.
  - Contact divakov.gleb@gmail.com to obtain a commercial license.
This project uses third-party components licensed under their own terms.
- NeuroSync (NeuroSync Local_API): used for lipsync blendshape generation and TTS.
  - Free for individuals and small organizations under $1M/year.
  - Commercial license required for larger organizations.
  - Full license text: `Local_API/LICENCE`
- NeuroSync Player: used for handling player-side functionality and integration with NeuroSync services.
  - Free for individuals and small organizations under $1M/year.
  - Commercial license required for larger organizations.
  - Full license text: `Player/LICENCE`