This project is a voice-based conversational AI agent built using LiveKit, Deepgram, and OpenAI.
The system consists of three main parts:
- LiveKit Server: Acts as the central media server, routing audio between the user and the agent.
- Backend Agent (Python): A Python application that listens to the user, transcribes their speech, generates a response using an LLM, and speaks the response back.
- Frontend Client (Next.js): A web application that provides the user interface, handles the connection to the LiveKit room, and displays the conversation.
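
For orientation, here is a minimal sketch of what the backend agent's pipeline (speech-to-text → LLM → text-to-speech) could look like, assuming the livekit-agents Python SDK with its Deepgram, OpenAI, and Silero plugins. The entrypoint, instructions, and model names below are illustrative assumptions, not the project's actual code:

```python
from livekit.agents import Agent, AgentSession, JobContext, WorkerOptions, cli
from livekit.plugins import deepgram, openai, silero


async def entrypoint(ctx: JobContext):
    # Connect this worker to the LiveKit room it was dispatched to.
    await ctx.connect()

    # Wire up the voice pipeline: Deepgram for transcription, OpenAI for the
    # LLM response and speech synthesis, Silero for voice activity detection.
    session = AgentSession(
        vad=silero.VAD.load(),
        stt=deepgram.STT(),
        llm=openai.LLM(model="gpt-4o-mini"),
        tts=openai.TTS(),
    )

    await session.start(
        agent=Agent(instructions="You are a helpful voice assistant."),
        room=ctx.room,
    )


if __name__ == "__main__":
    cli.run_app(WorkerOptions(entrypoint_fnc=entrypoint))
```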
To run the project locally you will need:
- Node.js (v18 or later)
- Python 3.11+ and [uv](https://github.com/astral-sh/uv)
- Docker (for containerized deployment)
- Access keys for LiveKit, OpenAI, and Deepgram
To set up the project:

- Clone the repository:

  ```bash
  git clone <repository-url>
  cd voice-agent
  ```

- Configure environment variables. Copy the example environment file and fill in your API keys:

  ```bash
  cp .env.example .env
  ```

  You will need to do the same for the frontend:

  ```bash
  cp frontend/.env.local.example frontend/.env.local
  ```
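
The authoritative variable list is in the two example files, but for a LiveKit + OpenAI + Deepgram stack the backend .env typically contains values along these lines (placeholders only):

```
LIVEKIT_URL=wss://<your-project>.livekit.cloud
LIVEKIT_API_KEY=<your-livekit-api-key>
LIVEKIT_API_SECRET=<your-livekit-api-secret>
OPENAI_API_KEY=<your-openai-api-key>
DEEPGRAM_API_KEY=<your-deepgram-api-key>
```

The frontend's .env.local usually needs at least the LiveKit server URL (plus whatever its token endpoint expects); again, defer to frontend/.env.local.example.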
To run the backend agent:

- Navigate to the backend directory:

  ```bash
  cd backend
  ```

- Install dependencies:

  ```bash
  uv sync
  ```

- Download the VAD models (a sketch of what this script likely does follows this list):

  ```bash
  uv run python download.py
  ```

- Run the agent:

  ```bash
  uv run python -m src.main dev
  ```
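
The download step above presumably prefetches the Silero VAD model so nothing has to be fetched when the agent first starts. A hypothetical minimal download.py, assuming the silero plugin from livekit-agents, could be as small as:

```python
from livekit.plugins import silero

if __name__ == "__main__":
    # Load the Silero VAD once so its model files are present and cached
    # before the agent process runs; other models could be prefetched here too.
    silero.VAD.load()
```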
To run the frontend client:

- Navigate to the frontend directory:

  ```bash
  cd frontend
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Run the development server:

  ```bash
  npm run dev
  ```

- Open your browser to http://localhost:3000.