A real-time voice AI agent using LiveKit and Google's Gemini Realtime API for natural conversation.
- Real-time bidirectional voice conversation with AI
- Natural speech processing and response generation
- Web-based interface for easy access
- Continuous conversation flow (not just single responses)
- LiveKit - Real-time audio streaming
- Google Gemini API - AI conversation model
- Flask - Web backend for token generation
- HTML/JavaScript - Browser-based voice interface
Prerequisites: Install Docker Desktop
- Set up environment:
    cp env.example .env
  Edit .env with your LiveKit and Google Cloud credentials.
- Run with Docker:
    docker-compose up --build
- Start conversation:
  - Open http://localhost:5000
  - Click "Join Conversation"
  - Allow microphone access
  - Start talking with the AI agent
To run locally without Docker:
- Install dependencies:
    pip install -r requirements.txt
- Set up environment:
    cp env.example .env
  Edit .env with your LiveKit and Google Cloud credentials.
- Run the application:
    python run_webui.py
- Start conversation:
  - Open http://localhost:5000
  - Click "Join Conversation"
  - Allow microphone access
  - Start talking with the AI agent
LIVEKIT_URL=wss://your-livekit-server.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
GOOGLE_API_KEY=your_google_api_key
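
Before starting the app, it can help to confirm that all four variables above are actually set. The following is a small sketch of such a check, not part of the project itself. The names `REQUIRED_VARS` and `missing_env_vars` are hypothetical. It inspects the process environment only, so it assumes the values from `.env` have already been loaded (by Docker Compose or a dotenv loader, for example).

```python
import os

# The variable names listed in the example .env file.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "GOOGLE_API_KEY",
)


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or blank.

    `environ` defaults to the real process environment; passing a plain
    dict makes the check easy to exercise in isolation.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` at startup and exiting with a clear message when the list is non-empty reports a configuration problem immediately, rather than later as a failed LiveKit connection or a rejected Gemini request.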