Feature/interrupt handler sh4shv4t#497
Open
sh4shv4t wants to merge 2 commits into Dark-Sys-Jenkins:main from
Implement context-aware interruption filtering that distinguishes between
passive acknowledgements ("yeah", "ok", "hmm") and intentional interruptions
("stop", "wait") based on agent speaking state.
Key changes:
- Add configurable `ignored_words` option to AgentSession with sensible defaults
- Add `_is_soft_input()` and `_should_ignore_interruption()` helpers in AgentActivity
- Pass STT transcript to interruption handler for semantic filtering
- Support LIVEKIT_AGENT_IGNORED_WORDS env var for runtime configuration
When agent is speaking, backchannel words are ignored seamlessly without
pause or stutter. When agent is silent, all input is processed normally.
Includes demo script and unit tests (11 tests passing).
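The state-aware filter described in this commit message could be sketched roughly as follows, using exact matching only. The function names echo the helpers named above, but the bodies, the three-word example set, and the comma-separated format of the environment variable are all assumptions made for illustration, not the PR's actual code.

```python
import os
import string

# Illustrative defaults only; the PR's actual default list is not shown here.
EXAMPLE_IGNORED_WORDS = frozenset({"yeah", "ok", "hmm"})


def load_ignored_words(env=None):
    """Return ignored words, preferring LIVEKIT_AGENT_IGNORED_WORDS if set.

    Assumes a comma-separated value; the real format may differ.
    """
    env = os.environ if env is None else env
    raw = env.get("LIVEKIT_AGENT_IGNORED_WORDS", "")
    words = {w.strip().lower() for w in raw.split(",") if w.strip()}
    return frozenset(words) if words else EXAMPLE_IGNORED_WORDS


def is_soft_input(transcript, ignored_words):
    """True when every word in the transcript is a passive acknowledgement."""
    tokens = [t.strip(string.punctuation).lower() for t in transcript.split()]
    tokens = [t for t in tokens if t]
    return bool(tokens) and all(t in ignored_words for t in tokens)


def should_ignore_interruption(transcript, agent_speaking, ignored_words):
    """Filter only while the agent is speaking; otherwise process everything."""
    return agent_speaking and is_soft_input(transcript, ignored_words)
```

In this sketch, `should_ignore_interruption("Yeah, ok.", True, words)` is true while the same transcript with `agent_speaking=False` is not, which matches the "silent means process normally" rule stated above.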
…uzzy matching
Add context-aware interruption detection to distinguish between backchannel
responses ("yeah", "okay", "hmm") and genuine interruptions ("stop", "wait").
This enables natural conversation flow where users can acknowledge they're
listening without disrupting the agent.
Key Features:
- Configurable fuzzy string matching using rapidfuzz (default 80% threshold)
- Handles STT typos and variations automatically ("yeahh" → "yeah" @ 88%)
- Sub-millisecond performance with process.extractOne optimization
- State-aware: only filters interruptions when agent is speaking
- Robust error handling with safe fallback behavior
- 16 default backchannel words (configurable via param or env var)
- Comprehensive debug logging for production troubleshooting
Technical Implementation:
- agent_activity.py: Add _is_soft_input() and _should_ignore_interruption()
with fuzzy matching, error handling, and performance optimizations
- agent_session.py: Add DEFAULT_IGNORED_WORDS, fuzzy_match_threshold param,
and environment variable support (LIVEKIT_AGENT_IGNORED_WORDS)
- Chose fuzzy matching over semantic embeddings due to latency (<1ms vs 50-200ms)
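To make the word-level fuzzy check concrete, here is a minimal sketch that avoids the rapidfuzz dependency by using the standard library's `difflib` as a stand-in scorer. `best_match` only imitates the role that `process.extractOne` plays in the PR; the word list is an illustrative subset, and the fallback direction (treat failures as a real interruption) is an assumption, since the description only says the fallback is "safe".

```python
from difflib import SequenceMatcher

# Illustrative subset; the PR's 16 default words are not listed in this description.
EXAMPLE_IGNORED_WORDS = ("yeah", "okay", "hmm")


def best_match(word, choices, score_cutoff=80.0):
    """Return (choice, score) for the closest choice at or above the cutoff, else None.

    Stand-in for rapidfuzz's process.extractOne. Scores are on a 0-100 scale,
    but they come from difflib, so values may differ slightly from rapidfuzz.
    """
    best = None
    for choice in choices:
        score = 100.0 * SequenceMatcher(None, word, choice).ratio()
        if score >= score_cutoff and (best is None or score > best[1]):
            best = (choice, score)
    return best


def is_soft_word(word, choices=EXAMPLE_IGNORED_WORDS, threshold=80.0):
    """True when the word is close enough to a known backchannel word."""
    try:
        return best_match(word.lower(), choices, threshold) is not None
    except Exception:
        # Assumed fallback: on any failure, treat the input as a real interruption.
        return False
```

With the default threshold of 80, "yeahh" and "okayy" are accepted while "stop" is not.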
Testing & Documentation:
- 24 comprehensive tests covering exact/fuzzy matching, edge cases, thresholds
- Demo application with usage examples and configuration display
- Complete technical specification in PLAN.md with 8-minute video script
- Interactive demonstration_walkthrough.py script with mock scenarios
- Enhanced README.md with detailed feature description
- PR_MESSAGE.md with comprehensive implementation details
- Token generation utility (generate_token.py) for LiveKit playground
Behavior Matrix:
- "yeah/okay/hmm" while speaking → agent continues (backchannel)
- "yeahh/okayy" while speaking → agent continues (fuzzy match)
- "wait/stop/no" while speaking → agent stops (real interruption)
- "yeah but wait" while speaking → agent stops (mixed input)
- Any input when silent → processed normally
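The matrix above can be read as a small decision function. The sketch below is one possible reading, assuming a transcript counts as backchannel only if every word in it fuzzily matches a soft word; `decide`, the outcome labels, and the three-word list are hypothetical names for illustration, and `difflib` again stands in for rapidfuzz.

```python
from difflib import SequenceMatcher

SOFT_WORDS = ("yeah", "okay", "hmm")  # illustrative subset, not the PR's defaults


def is_soft_word(word, threshold=80.0):
    """True when the word scores at or above the threshold against any soft word."""
    return any(
        100.0 * SequenceMatcher(None, word, soft).ratio() >= threshold
        for soft in SOFT_WORDS
    )


def decide(transcript, agent_speaking):
    """Return 'continue', 'stop', or 'process' for one transcript.

    Empty transcripts while speaking fall through to 'stop' in this sketch.
    """
    if not agent_speaking:
        return "process"
    words = transcript.lower().split()
    if words and all(is_soft_word(w) for w in words):
        return "continue"
    return "stop"


# Rows mirror the behavior matrix: (transcript, agent_speaking, expected outcome).
BEHAVIOR_MATRIX = [
    ("yeah", True, "continue"),
    ("yeahh", True, "continue"),
    ("stop", True, "stop"),
    ("yeah but wait", True, "stop"),
    ("yeah", False, "process"),
]
```

Requiring every word to match is what makes the mixed input "yeah but wait" stop the agent: "but" and "wait" are nowhere near any soft word.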
Configuration:
- Default: fuzzy_match_threshold=80 (balanced)
- Lenient: fuzzy_match_threshold=70 (noisy/accents)
- Strict: fuzzy_match_threshold=90 (formal/clear audio)
- Exact: fuzzy_match_threshold=100 (testing/debugging)
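As a worked example of how these presets behave: "yeahh" and "yeah" share four characters out of nine in total, giving a similarity of 2 × 4 / 9 ≈ 88.9, so the pair matches at thresholds 70 and 80 but not at 90 or 100. The sketch below checks this with `difflib` as a stand-in scorer; the preset names are taken from the list above, while `accepted_by` is a hypothetical helper and the inclusive comparison is an assumption.

```python
from difflib import SequenceMatcher

# Threshold presets from the configuration list above.
PRESETS = {"lenient": 70, "default": 80, "strict": 90, "exact": 100}


def score(a, b):
    """Similarity on a 0-100 scale (difflib stand-in for rapidfuzz's ratio)."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def accepted_by(word, target):
    """Map each preset name to whether `word` matches `target` at that threshold."""
    s = score(word, target)
    return {name: s >= cutoff for name, cutoff in PRESETS.items()}
```

Under these assumptions an identical word still passes the "exact" preset, because the comparison is inclusive and identical strings score 100.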
Breaking Changes: None (backward compatible)
Dependencies: Added rapidfuzz>=3.0.0 for fuzzy string matching
Closes: Intelligent interruption handling implementation
Intelligent Interruption Handling with Fuzzy Matching by Shashvat Singh
Summary
Implements intelligent interruption detection using configurable fuzzy string matching to distinguish between backchannel responses (e.g., "yeah", "okay", "hmm") and genuine interruptions (e.g., "stop", "wait"). Enables natural conversation flow by allowing users to acknowledge they're listening without disrupting the agent.
Demo
🎥 Video walkthrough & technical deep-dive: Video Link
    # Quick test
    uv run pytest tests/test_intelligent_interruption.py -v
Problem
Traditional voice agents treat all user speech uniformly, causing poor UX:
Solution
Fuzzy string matching with rapidfuzz for intelligent classification:
- process.extractOne
Technical decision: Chose fuzzy matching over semantic embeddings due to latency requirements (<1ms vs 50-200ms). Real-time voice demands immediate responses; embeddings would add user-perceivable delay.
Changes
livekit-agents/livekit/agents/voice/agent_activity.py
- _is_soft_input(): Fuzzy matching with configurable threshold, error handling, debug logging
- _should_ignore_interruption(): State-aware interruption logic
- process.extractOne vs O(n²) nested loops
livekit-agents/livekit/agents/voice/agent_session.py
- DEFAULT_IGNORED_WORDS: 16 common backchannel words
- AgentSessionOptions.fuzzy_match_threshold: Configurable 0-100% (default 80)
- LIVEKIT_AGENT_IGNORED_WORDS
tests/test_intelligent_interruption.py
examples/voice_agents/intelligent_interruption_demo.py
Documentation
- README.md: Enhanced feature description
- demonstration_walkthrough.py: Interactive demo script
Acceptance Criteria ✅
Testing
Automated
Manual
Configuration Examples
Key Features
Dependencies
Breaking Changes
None. Backward-compatible enhancement - existing agents work without changes.
Author: Shashvat Singh
Date: February 2, 2026