This project demonstrates an interactive avatar system that can be controlled through both text input and voice commands. The application uses the HeyGen Streaming Avatar API for the avatar and, for voice-to-text transcription, OpenAI's Whisper API and Groq's whisper-large-v3 model.
- Real-time interactive avatar with speech capabilities
- Text input for direct communication with the avatar
- Voice command functionality using microphone input
- Financial analysis assistant powered by OpenAI
Before you begin, ensure you have met the following requirements:
- Node.js (v18 or later)
- npm (v9 or later)
- An API key for HeyGen Streaming Avatar
- An API key for OpenAI
- An API key for Groq
- Clone the repository:

      git clone https://github.com/your-username/interactive-avatar-demo.git
      cd interactive-avatar-demo

- Install the dependencies:

      npm install

- Create a .env file in the root directory of the project and add your API keys:

      VITE_HEYGEN_API_KEY=your_heygen_api_key
      VITE_OPENAI_API_KEY=your_openai_api_key
      VITE_GROQ_API_KEY=your_groq_api_key
To run the application in development mode:

    npm run dev

This will start the development server and open the application in your default browser. The application will be available at http://localhost:5173.
- src/main.ts: Main application file that initializes the avatar and handles user interactions
- src/audio/audio-handler.ts: Handles audio recording and transcription
- src/openai/openai_assistant.ts: Implements the OpenAI assistant functionality
- index.html: Main HTML file with the application structure
- src/style.css: Global styles for the application
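
To illustrate the transcription step handled in src/audio/audio-handler.ts, here is a minimal sketch of how a recorded audio blob could be sent to either provider. It is not the project's actual code: the names `buildTranscriptionRequest`, `transcribe`, and `PROVIDERS`, the `recording.webm` file name, and the choice of the `whisper-1` model for OpenAI are assumptions made for illustration. Both endpoints accept a multipart form with `file` and `model` fields and a bearer token.

```typescript
// Hypothetical sketch; names and structure are illustrative and are not
// taken from src/audio/audio-handler.ts.
type Provider = "openai" | "groq";

interface ProviderConfig {
  url: string;
  model: string;
}

// Transcription endpoints and model names for each provider.
const PROVIDERS: Record<Provider, ProviderConfig> = {
  openai: {
    url: "https://api.openai.com/v1/audio/transcriptions",
    model: "whisper-1", // assumed model choice for OpenAI
  },
  groq: {
    url: "https://api.groq.com/openai/v1/audio/transcriptions",
    model: "whisper-large-v3",
  },
};

interface TranscriptionRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: FormData;
}

// Build the multipart request without sending it, so it can be inspected.
function buildTranscriptionRequest(
  provider: Provider,
  apiKey: string,
  audio: Blob,
): TranscriptionRequest {
  const { url, model } = PROVIDERS[provider];
  const body = new FormData();
  body.append("file", audio, "recording.webm"); // assumed file name
  body.append("model", model);
  return {
    url,
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body,
  };
}

// Send the request and return the transcribed text.
async function transcribe(
  provider: Provider,
  apiKey: string,
  audio: Blob,
): Promise<string> {
  const { url, method, headers, body } = buildTranscriptionRequest(provider, apiKey, audio);
  const response = await fetch(url, { method, headers, body });
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

In the browser, the `audio` blob would typically come from a `MediaRecorder` attached to the microphone stream, and the returned text could then be passed to the avatar or the assistant.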
The application uses the following environment variables:
- VITE_HEYGEN_API_KEY: Your HeyGen API key for accessing the streaming avatar service
- VITE_OPENAI_API_KEY: Your OpenAI API key for voice-to-text transcription
- VITE_GROQ_API_KEY: Your Groq API key for voice-to-text transcription
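
Because a missing key usually shows up only as a failed API call at runtime, it can help to check the variables once at startup. The sketch below is one way to do that, not the project's own code: `readApiKeys` and `REQUIRED_KEYS` are hypothetical names, and treating all three keys as required is an assumption. Vite exposes variables prefixed with `VITE_` on `import.meta.env`.

```typescript
// Hypothetical startup check; treating all three keys as required is an assumption.
const REQUIRED_KEYS = [
  "VITE_HEYGEN_API_KEY",
  "VITE_OPENAI_API_KEY",
  "VITE_GROQ_API_KEY",
] as const;

type EnvKey = (typeof REQUIRED_KEYS)[number];

// Return the API keys, or throw an error naming every missing variable.
function readApiKeys(env: Record<string, string | undefined>): Record<EnvKey, string> {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]?.trim());
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
  const keys = {} as Record<EnvKey, string>;
  for (const key of REQUIRED_KEYS) {
    keys[key] = env[key] as string;
  }
  return keys;
}

// In a Vite app this could be called once, for example in src/main.ts:
//   const keys = readApiKeys(import.meta.env);
```

Taking the environment as a parameter, rather than reading `import.meta.env` inside the function, keeps the check easy to exercise outside the browser.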