Refeature/interrupt-handler-SirjanSingh #498
SirjanSingh wants to merge 5 commits into Dark-Sys-Jenkins:main from
Updated README to reflect intelligent interruption handling implementation for LiveKit Voice Agent, detailing challenges, solutions, and key code changes.
Refactor basic agent to implement a hybrid interruption strategy for better command recognition and filler handling.
Refine the hybrid interruption handling strategy to better distinguish between filler words and commands based on timing.
Pull request overview
This PR refactors the basic_agent.py voice agent to implement a hybrid interruption-handling strategy that distinguishes filler acknowledgments from explicit stop commands, and adds detailed documentation in examples/voice_agents/README.md explaining the design.
Changes:
- Implemented `IntelligentAgent` in `basic_agent.py` with VAD configuration, filler/command word detection, and event-driven state tracking (`speech_created`, `agent_state_changed`, `user_state_changed`, `user_input_transcribed`) to realize the hybrid interruption strategy.
- Tuned `AgentSession` options (VAD thresholds, interruption settings, endpointing delays) to support the new behavior and added extensive logging for debugging.
- Rewrote `examples/voice_agents/README.md` to document the intelligent interruption handler, including requirements, design rationale, configuration parameters, and testing scenarios.
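For readers unfamiliar with the filler/command distinction, here is a minimal sketch of transcript classification. It is not the PR's implementation: the function name `classify_transcript` and the word sets are hypothetical, seeded only with the examples the README diff mentions ("yeah", "okay", "stop", "wait").

```python
import re

# Hypothetical word lists; the PR's actual lists are not shown in this review.
FILLER_WORDS = {"yeah", "okay"}
COMMAND_WORDS = {"stop", "wait"}


def classify_transcript(text: str) -> str:
    """Label a transcript as 'command', 'filler', 'other', or 'empty'."""
    # Lowercase and keep only word-like tokens so punctuation is ignored.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return "empty"
    # Any command word wins, even when mixed with fillers ("yeah, wait").
    if any(word in COMMAND_WORDS for word in words):
        return "command"
    # Only a transcript made up entirely of fillers counts as a filler.
    if all(word in FILLER_WORDS for word in words):
        return "filler"
    return "other"
```

Checking commands before fillers means a mixed utterance such as "yeah, wait" is treated as a command, which seems the safer default for a stop request.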
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `examples/voice_agents/basic_agent.py` | Replaces the simple basic agent with `IntelligentAgent`, configuring VAD/interrupt options and transcript-based logic to differentiate fillers from stop commands, and wires up relevant session event handlers with diagnostic logging. |
| `examples/voice_agents/README.md` | Replaces the generic voice-agents index with a detailed design doc for the intelligent interruption handler implemented in `basic_agent.py`, covering behavior, configuration, and test scenarios. |
examples/voice_agents/basic_agent.py
Outdated
| import logging | ||
| import re | ||
| import asyncio | ||
| from typing import Optional | ||
| from dotenv import load_dotenv | ||
| from livekit.agents import ( | ||
| Agent, | ||
| AgentServer, | ||
| AgentSession, | ||
| JobContext, | ||
| JobProcess, | ||
| MetricsCollectedEvent, | ||
| RunContext, | ||
| cli, | ||
| metrics, | ||
| room_io, | ||
| Agent, AgentServer, AgentSession, JobContext, JobProcess, | ||
| cli, UserInputTranscribedEvent, AgentStateChangedEvent, | ||
| UserStateChangedEvent |
These imports (asyncio, Optional, UserInputTranscribedEvent, AgentStateChangedEvent, UserStateChangedEvent) are currently unused in this module and can be removed to reduce clutter and keep dependencies minimal, or alternatively used for type annotations on the event handlers if that was the intention.
examples/voice_agents/basic_agent.py
Outdated
|
|
||
| self.is_speaking = False | ||
| self.was_interrupted_by_vad = False | ||
| self.last_speech_content = "" |
last_speech_content is written when speech_created fires but never read anywhere, so this state adds complexity without affecting behavior; consider either removing it or wiring it into a concrete resume/use case.
| self.last_speech_content = "" |
examples/voice_agents/basic_agent.py
Outdated
| except: | ||
| logger.warning("⚠️ False interruption event not available in this LiveKit version") |
Using a bare except: here will swallow system-exiting exceptions (e.g., KeyboardInterrupt, SystemExit) and makes debugging harder; it would be safer to catch Exception explicitly or a narrower error type for feature-detection of agent_false_interruption.
| except: | |
| logger.warning("⚠️ False interruption event not available in this LiveKit version") | |
| except Exception as exc: | |
| logger.warning( | |
| "⚠️ False interruption event not available in this LiveKit version: %s", | |
| exc, | |
| ) |
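The difference between the two handlers can be checked in isolation. The sketch below uses a hypothetical helper, `probe`, that is not part of the PR; it raises a given exception and reports whether `except Exception` catches it.

```python
def probe(exc: BaseException) -> str:
    """Report whether `except Exception` catches the given exception."""
    try:
        try:
            raise exc
        except Exception:
            # Ordinary errors (ValueError, AttributeError, ...) land here.
            return "caught by except Exception"
    except BaseException:
        # KeyboardInterrupt and SystemExit derive from BaseException only,
        # so they skip the inner handler and reach this one.
        return "propagated past except Exception"
```

A bare `except:` behaves like the outer `except BaseException` clause, which is why it would also swallow Ctrl+C and interpreter shutdown requests.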
| **Requirements:** | ||
| 1. **When agent is speaking + user says filler** → Agent continues uninterrupted | ||
| 2. **When agent is speaking + user says command** → Agent stops immediately | ||
| 3. **When agent is silent** → All user speech is valid input |
The requirements section states "When agent is silent → All user speech is valid input" (point 3), but later in the document and in basic_agent.py, filler input while idle is explicitly suppressed. The requirements text should be updated to match the designed behavior (idle fillers are ignored).
| 3. **When agent is silent** → All user speech is valid input | |
| 3. **When agent is silent** → Non-filler user speech is valid input; idle fillers may be ignored |
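The three requirements, with the suggested wording for point 3, amount to a small decision table. The sketch below is illustrative only: `Action` and `decide` are hypothetical names, and treating non-filler, non-command speech during agent output as an interruption is an assumption, since the requirements do not cover that case.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"  # agent keeps speaking
    STOP = "stop"          # agent stops immediately
    IGNORE = "ignore"      # input is dropped
    RESPOND = "respond"    # input is handled as a normal user turn


def decide(agent_speaking: bool, kind: str) -> Action:
    """Map agent state and transcript kind ('filler', 'command', 'other') to an action."""
    if agent_speaking:
        if kind == "filler":
            return Action.CONTINUE  # requirement 1
        # Requirement 2 covers commands; applying STOP to 'other' speech
        # as well is an assumption, not something the PR states.
        return Action.STOP
    if kind == "filler":
        return Action.IGNORE  # idle fillers may be ignored (revised point 3)
    return Action.RESPOND  # non-filler speech while idle is valid input
```

Note that under this table a word like "stop" spoken while the agent is silent is simply passed through as ordinary input.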
| # Intelligent Interruption Handling for LiveKit Voice Agent | ||
|
|
||
| This directory contains a comprehensive collection of voice-based agent examples demonstrating various capabilities and integrations with the LiveKit Agents framework. | ||
| ## Overview | ||
|
|
||
| ## 📋 Table of Contents | ||
| This document explains the modifications made to `basic_agent.py` to implement intelligent interruption handling that distinguishes between **filler words** (acknowledgments like "yeah", "okay") and **command words** (interruptions like "stop", "wait"). |
This README now focuses solely on the intelligent interruption handler in basic_agent.py, but the examples/voice_agents directory still contains many other example scripts that were previously indexed here; consider restoring or relocating a lightweight table of contents so users can still discover the rest of the voice agent examples from this README.
examples/voice_agents/basic_agent.py
Outdated
| ) | ||
| from livekit.agents.llm import function_tool | ||
| from livekit.plugins import silero | ||
| from livekit.plugins import silero, deepgram, openai, cartesia |
Import of 'deepgram' is not used.
Import of 'openai' is not used.
Import of 'cartesia' is not used.
| from livekit.plugins import silero, deepgram, openai, cartesia | |
| from livekit.plugins import silero |
Refactor IntelligentAgent to improve command and filler detection logic, update logging for clarity, and streamline state management.
Added student details and improved documentation on intelligent interruption handling in voice agents.
PR: Implement Hybrid Intelligent Interruption Handler
Summary
Refactored the voice agent to implement a hybrid interruption handling strategy that correctly differentiates between filler words and stop commands.
Problem Solved
Solution
- `resume_false_interruption=True`
Key Changes
- `@function_tool` weather lookup (moving to separate PR)
- (`was_interrupted_by_vad`)
- `speech_created` and `user_state_changed` event handlers