umara25/VoiceAssistant

Voice Agent Prototype

A real-time voice AI agent using LiveKit and Google's Gemini Realtime API for natural conversation.

What It Does

  • Real-time bidirectional voice conversation with AI
  • Natural speech processing and response generation
  • Web-based interface for easy access
  • Continuous conversation flow (not just single responses)

Built With

  • LiveKit - Real-time audio streaming
  • Google Gemini API - AI conversation model
  • Flask - Web backend for token generation
  • HTML/JavaScript - Browser-based voice interface

Quick Start

Option 1: Docker (Recommended)

Prerequisites: Install Docker Desktop

  1. Set up environment:

    cp env.example .env

    Edit .env and fill in your LiveKit credentials and Google API key (listed under Environment Variables).

  2. Run with Docker:

    docker-compose up --build

  3. Start conversation:

    • Open http://localhost:5000
    • Click "Join Conversation"
    • Allow microphone access
    • Start talking with the AI agent

Option 2: Local Development

  1. Install dependencies:

    pip install -r requirements.txt

  2. Set up environment:

    cp env.example .env

    Edit .env and fill in your LiveKit credentials and Google API key (listed under Environment Variables).

  3. Run the application:

    python run_webui.py

  4. Start conversation:

    • Open http://localhost:5000
    • Click "Join Conversation"
    • Allow microphone access
    • Start talking with the AI agent

Environment Variables

LIVEKIT_URL=wss://your-livekit-server.livekit.cloud
LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
GOOGLE_API_KEY=your_google_api_key
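
A missing or unedited variable is an easy mistake to make at this step, so a small pre-flight check can help. The sketch below is not part of the repository; the names `REQUIRED_VARS`, `missing_settings` and `looks_like_placeholder` are hypothetical. It only inspects the process environment and does not read `.env` itself, since how the project loads that file is not stated here.

```python
import os

# The four variables listed above.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "GOOGLE_API_KEY",
)


def missing_settings(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def looks_like_placeholder(value):
    """Heuristic: the sample values above all contain 'your_' or 'your-'."""
    lowered = value.lower()
    return "your_" in lowered or "your-" in lowered


if __name__ == "__main__":
    missing = missing_settings()
    unedited = [
        name for name in REQUIRED_VARS
        if name not in missing and looks_like_placeholder(os.environ[name])
    ]
    if missing:
        print("Missing:", ", ".join(missing))
    if unedited:
        print("Still set to sample values:", ", ".join(unedited))
    if not missing and not unedited:
        print("All required variables are set.")
```

Running it in the same shell or container that starts the app gives a quick answer before any connection to LiveKit or Gemini is attempted.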

About

AI Voice assistant made with LiveKit, Gemini and Python
