Feature/interrupt handler sh4shv4t by sh4shv4t · Pull Request #497 · Dark-Sys-Jenkins/agents-assignment

sh4shv4t · 2026-02-02T18:24:04Z

Intelligent Interruption Handling with Fuzzy Matching by Shashvat Singh

Summary

Implements intelligent interruption detection using configurable fuzzy string matching to distinguish between backchannel responses (e.g., "yeah", "okay", "hmm") and genuine interruptions (e.g., "stop", "wait"). Enables natural conversation flow by allowing users to acknowledge they're listening without disrupting the agent.

Demo

🎥 Video walkthrough & technical deep-dive: Video Link

# Quick test
uv run pytest tests/test_intelligent_interruption.py -v

Problem

Traditional voice agents treat all user speech as an interruption, causing poor UX:

Users say "yeah" or "hmm" while agent speaks → agent incorrectly stops
STT variations like "yeahh" aren't recognized as backchannel
No configurability for different use cases (customer service vs accessibility)
Edge cases (empty strings, unicode) can crash interruption logic

Solution

Fuzzy string matching with rapidfuzz for intelligent classification:

Recognizes 16 default backchannel words with configurable threshold (default 80%)
Handles STT typos automatically ("yeahh" → "yeah" at 88% similarity)
Sub-millisecond performance using process.extractOne
State-aware: only filters when agent is speaking

Technical decision: Chose fuzzy matching over semantic embeddings due to latency requirements (<1ms vs 50-200ms). Real-time voice demands immediate responses; embeddings would add user-perceivable delay.
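The classification described above can be sketched roughly as follows. This is a minimal illustration, not the PR's `_is_soft_input()`: the word list is a hypothetical subset (the 16 defaults are not listed in this description), the rule that every token must match is an assumption about how mixed input is detected, and the sketch falls back to the standard library's `difflib` when rapidfuzz is not installed.

```python
import string
from difflib import SequenceMatcher

try:
    # rapidfuzz is the library this PR depends on.
    from rapidfuzz import fuzz, process
except ImportError:
    # Stand-in so the sketch still runs without the dependency.
    fuzz = process = None

# Hypothetical subset for illustration; not the PR's DEFAULT_IGNORED_WORDS.
EXAMPLE_IGNORED_WORDS = ["yeah", "okay", "ok", "hmm", "right", "sure"]


def best_score(token: str, choices: list[str]) -> float:
    """Best 0-100 similarity between one token and any ignored word."""
    if process is not None:
        match = process.extractOne(token, choices, scorer=fuzz.ratio)
        return float(match[1]) if match else 0.0
    return max(
        (SequenceMatcher(None, token, c).ratio() * 100 for c in choices),
        default=0.0,
    )


def is_soft_input(transcript, ignored_words=EXAMPLE_IGNORED_WORDS, threshold=80.0) -> bool:
    """True only if every token fuzzy-matches an ignored word (assumed rule)."""
    tokens = [t.strip(string.punctuation) for t in transcript.lower().split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        # Assumption: empty or whitespace-only input is not backchannel.
        return False
    return all(best_score(t, ignored_words) >= threshold for t in tokens)
```

Under these assumptions, "yeahh" counts as backchannel, while "yeah but wait" does not, because "but" and "wait" match nothing in the list.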

Agent State	User Says	Result
Speaking	"yeah", "hmm"	Continues (backchannel)
Speaking	"yeahh", "okayy"	Continues (fuzzy match)
Speaking	"wait", "stop"	Stops (real interruption)
Speaking	"yeah but wait"	Stops (mixed input)
Silent	Any input	Processed normally

Changes

livekit-agents/livekit/agents/voice/agent_activity.py

_is_soft_input(): Fuzzy matching with configurable threshold, error handling, debug logging
_should_ignore_interruption(): State-aware interruption logic
Performance: O(n) with process.extractOne vs O(n²) nested loops
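The state-aware decision might look something like the sketch below. The `AgentState` enum, the function signature, and the choice to let the interruption through when classification fails are all assumptions made for illustration; the actual `_should_ignore_interruption()` in `agent_activity.py` may be structured differently.

```python
from enum import Enum
from typing import Callable


class AgentState(Enum):
    """Hypothetical states; the real agent state type may differ."""

    SPEAKING = "speaking"
    LISTENING = "listening"


def should_ignore_interruption(
    state: AgentState,
    transcript: str,
    is_soft: Callable[[str], bool],
) -> bool:
    """Return True when user speech should NOT interrupt the agent."""
    if state is not AgentState.SPEAKING:
        # Agent is silent: every input is processed normally.
        return False
    try:
        return is_soft(transcript)
    except Exception:
        # Assumed safe fallback: if classification fails, let the
        # interruption through rather than talk over the user.
        return False


def exact_soft(text: str) -> bool:
    """Exact-match stand-in for the fuzzy classifier, for demonstration only."""
    return text.strip().lower() in {"yeah", "okay", "hmm"}
```

Passing the classifier in as a callable keeps the state check separate from the matching logic, so either part can be tested on its own.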

livekit-agents/livekit/agents/voice/agent_session.py

DEFAULT_IGNORED_WORDS: 16 common backchannel words
AgentSessionOptions.fuzzy_match_threshold: Configurable 0-100% (default 80)
Environment variable: LIVEKIT_AGENT_IGNORED_WORDS
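As a rough sketch of how the environment variable and threshold could be handled: the comma-separated format, the fallback word tuple, and both helper names below are assumptions, since this description does not specify how `LIVEKIT_AGENT_IGNORED_WORDS` is parsed.

```python
import os

ENV_VAR = "LIVEKIT_AGENT_IGNORED_WORDS"

# Hypothetical fallback; the PR's DEFAULT_IGNORED_WORDS has 16 entries.
FALLBACK_WORDS = ("yeah", "okay", "hmm")


def load_ignored_words(environ=None) -> list[str]:
    """Parse a comma-separated word list (assumed format) from the environment."""
    environ = os.environ if environ is None else environ
    raw = environ.get(ENV_VAR, "")
    words = [w.strip().lower() for w in raw.split(",") if w.strip()]
    # An unset or empty variable falls back to the defaults.
    return words or list(FALLBACK_WORDS)


def validate_threshold(value: float) -> float:
    """Reject thresholds outside the documented 0-100 range."""
    if not 0 <= value <= 100:
        raise ValueError(f"fuzzy_match_threshold must be in [0, 100], got {value}")
    return float(value)
```

Taking `environ` as a parameter lets tests supply a plain dict instead of mutating the process environment.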

tests/test_intelligent_interruption.py

24 comprehensive tests across 5 classes
Coverage: exact matching, fuzzy matching, edge cases, configurable thresholds
All tests passing ✅

examples/voice_agents/intelligent_interruption_demo.py

Full demo agent with configuration examples
Displays ignored words and threshold at startup

Documentation

README.md: Enhanced feature description
demonstration_walkthrough.py: Interactive demo script

Acceptance Criteria ✅

✅ Agent continues over "yeah/okay/hmm" while speaking
✅ Fuzzy matching handles typos ("yeahh" → "yeah")
✅ Real interruptions work ("stop", "wait" → agent stops)
✅ Mixed input interrupts ("yeah but wait" → stops)
✅ Edge cases handled (empty, whitespace, unicode)
✅ Configurable thresholds (50%, 80%, 90%, 100%)
✅ Error resilient (doesn't crash on corrupt input)

Testing

Automated

# All 24 tests
uv run pytest tests/test_intelligent_interruption.py -v

# With coverage
uv run pytest tests/test_intelligent_interruption.py --cov=livekit.agents.voice

Manual

# Setup
cp examples/.env.example examples/.env  # Add your API keys
uv run examples/voice_agents/intelligent_interruption_demo.py download-files

# Test in console mode
uv run examples/voice_agents/intelligent_interruption_demo.py console

# Test with real voice in LiveKit playground
uv run examples/voice_agents/intelligent_interruption_demo.py dev
python3 generate_token.py  # Get connection URL

Configuration Examples

# Default (80% threshold, balanced)
agent = VoiceAgent()

# Lenient (70% - noisy environments, accents)
agent = VoiceAgent(fuzzy_match_threshold=70)

# Strict (90% - formal contexts, clear audio)
agent = VoiceAgent(fuzzy_match_threshold=90)

# Exact match only (100% - testing/debugging)
agent = VoiceAgent(fuzzy_match_threshold=100)

# Custom ignored words
agent = VoiceAgent(
    ignored_words=["yeah", "ok", "hmm", "sure", "right"],
    fuzzy_match_threshold=80
)
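To make the effect of these thresholds concrete, the sketch below scores "yeahh" against "yeah" and checks it at each setting. It uses the standard library's `difflib` as a stand-in for rapidfuzz so it runs without extra dependencies; the helper names are hypothetical, and scores from the PR's actual scorer may differ slightly for other inputs.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """0-100 similarity; difflib stands in for rapidfuzz here."""
    return SequenceMatcher(None, a, b).ratio() * 100


def passes(heard: str, ignored: str, threshold: float) -> bool:
    """True if the heard word is close enough to the ignored word."""
    return similarity(heard, ignored) >= threshold


# 2 * 4 matching characters / 9 total characters, about 88.9
score = similarity("yeahh", "yeah")

# Which of the documented settings would still treat "yeahh" as backchannel?
results = {t: passes("yeahh", "yeah", t) for t in (70, 80, 90, 100)}
```

Here `results` comes out as `{70: True, 80: True, 90: False, 100: False}`: the lenient and default settings absorb the extra "h", while the strict and exact settings treat "yeahh" as a real interruption.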

Key Features

Performance: Sub-millisecond fuzzy matching, no user-perceivable latency
Robustness: Try-except with safe fallback, detailed error logging
Flexibility: Threshold 0-100%, custom word lists, env var config
Observability: Debug logs with full transcript context

Dependencies

dependencies = ["rapidfuzz>=3.0.0"]

Breaking Changes

None. This is a backward-compatible enhancement: existing agents work without changes.

Author: Shashvat Singh
Date: February 2, 2026

Implement context-aware interruption filtering that distinguishes between passive acknowledgements ("yeah", "ok", "hmm") and intentional interruptions ("stop", "wait") based on agent speaking state.

Key changes:
- Add configurable `ignored_words` option to AgentSession with sensible defaults
- Add `_is_soft_input()` and `_should_ignore_interruption()` helpers in AgentActivity
- Pass STT transcript to interruption handler for semantic filtering
- Support LIVEKIT_AGENT_IGNORED_WORDS env var for runtime configuration

When agent is speaking, backchannel words are ignored seamlessly without pause or stutter. When agent is silent, all input is processed normally.

Includes demo script and unit tests (11 tests passing).

…uzzy matching

Add context-aware interruption detection to distinguish between backchannel responses ("yeah", "okay", "hmm") and genuine interruptions ("stop", "wait"). This enables natural conversation flow where users can acknowledge they're listening without disrupting the agent.

Key Features:
- Configurable fuzzy string matching using rapidfuzz (default 80% threshold)
- Handles STT typos and variations automatically ("yeahh" → "yeah" @ 88%)
- Sub-millisecond performance with process.extractOne optimization
- State-aware: only filters interruptions when agent is speaking
- Robust error handling with safe fallback behavior
- 16 default backchannel words (configurable via param or env var)
- Comprehensive debug logging for production troubleshooting

Technical Implementation:
- agent_activity.py: Add _is_soft_input() and _should_ignore_interruption() with fuzzy matching, error handling, and performance optimizations
- agent_session.py: Add DEFAULT_IGNORED_WORDS, fuzzy_match_threshold param, and environment variable support (LIVEKIT_AGENT_IGNORED_WORDS)
- Chose fuzzy matching over semantic embeddings due to latency (<1ms vs 50-200ms)

Testing & Documentation:
- 24 comprehensive tests covering exact/fuzzy matching, edge cases, thresholds
- Demo application with usage examples and configuration display
- Complete technical specification in PLAN.md with 8-minute video script
- Interactive demonstration_walkthrough.py script with mock scenarios
- Enhanced README.md with detailed feature description
- PR_MESSAGE.md with comprehensive implementation details
- Token generation utility (generate_token.py) for LiveKit playground

Behavior Matrix:
- "yeah/okay/hmm" while speaking → agent continues (backchannel)
- "yeahh/okayy" while speaking → agent continues (fuzzy match)
- "wait/stop/no" while speaking → agent stops (real interruption)
- "yeah but wait" while speaking → agent stops (mixed input)
- Any input when silent → processed normally

Configuration:
- Default: fuzzy_match_threshold=80 (balanced)
- Lenient: fuzzy_match_threshold=70 (noisy/accents)
- Strict: fuzzy_match_threshold=90 (formal/clear audio)
- Exact: fuzzy_match_threshold=100 (testing/debugging)

Breaking Changes: None (backward compatible)

Dependencies: Added rapidfuzz>=3.0.0 for fuzzy string matching

Closes: Intelligent interruption handling implementation

sh4shv4t added 2 commits January 31, 2026 02:58

sh4shv4t wants to merge 2 commits into Dark-Sys-Jenkins:main from sh4shv4t:feature/interrupt-handler-shashvat
