Summary
In demo/web_demo/WebRTC_Demo, duplex mode can show normal subtitles and state transitions but produce either no audible output or very choppy output (a sentence is split into discontinuous fragments).
Environment
- macOS (Apple Silicon), M1 Max, 64GB RAM
- WebRTC_Demo started via bash oneclick.sh start
- model: openbmb/MiniCPM-o-4_5-gguf
- LiveKit + backend + cpp server all healthy
Symptoms
- Frontend receives <state><audio_start>, <state><generate_end>, subtitles are correct
- Browser side shows remote audio track attached, but sound is missing or highly discontinuous
- C++ logs show generation succeeded and WAV chunks sent
Example (from logs):
- total generation time ~4.7s
- total audio duration ~2.0s
- RTF ~2.35x
- only a few chunks sent
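The numbers above can be checked with a small calculation. This is only a sketch: the helper names are hypothetical, and the buffering estimate assumes audio is generated at a uniform rate, which the logs do not confirm.

```python
def real_time_factor(generation_s: float, audio_s: float) -> float:
    """Seconds of compute needed per second of generated audio."""
    return generation_s / audio_s


def min_start_delay(generation_s: float, audio_s: float) -> float:
    """Delay before playback needed to avoid underflow.

    Assumes audio is produced at a uniform rate (an assumption,
    not something taken from the logs).
    """
    return max(0.0, generation_s - audio_s)


rtf = real_time_factor(4.7, 2.0)    # about 2.35, matching the log
delay = min_start_delay(4.7, 2.0)   # about 2.7 s of pre-buffering
print(f"RTF ~{rtf:.2f}x, start delay ~{delay:.1f}s")
```

An RTF above 1 means audio is produced more slowly than it is played, so a playback queue that starts immediately will run dry between chunks.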
Root Cause (confirmed locally)
In the backend audio path (voice_chat/omni_stream.py), int16 PCM is passed directly to scipy.signal.resample_poly(...). With integer input, the resampled output can become near-zero or all-zero for some inputs, so LiveKit receives effectively silent frames.
Additionally, coarse chunking, queue underflow, and an aggressive generate_end/play_end transition can make the output choppy.
Minimal Fix
Before resampling:
- Convert source audio to normalized float32 in [-1, 1]
- Run resample_poly on float32
- Convert back to int16 after resample
This fixed the silent-audio issue immediately in local verification.
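A minimal sketch of the three steps above. The function name resample_int16 and the 24 kHz to 48 kHz rates in the usage lines are assumptions for illustration; the actual helper names and sample rates in omni_stream.py may differ.

```python
from math import gcd

import numpy as np
from scipy.signal import resample_poly


def resample_int16(pcm: np.ndarray, src_rate: int, dst_rate: int) -> np.ndarray:
    """Resample mono int16 PCM by going through normalized float32.

    Hypothetical helper: the name and signature are not from the repo.
    """
    if src_rate == dst_rate:
        return pcm.astype(np.int16, copy=True)

    # Step 1: int16 -> float32 in [-1, 1)
    x = pcm.astype(np.float32) / 32768.0

    # Step 2: polyphase resampling on float data
    g = gcd(src_rate, dst_rate)
    y = resample_poly(x, dst_rate // g, src_rate // g)

    # Step 3: back to int16; clip first because filter overshoot
    # can push samples slightly outside [-1, 1]
    y = np.clip(y, -1.0, 1.0)
    return np.round(y * 32767.0).astype(np.int16)


# Usage: 100 ms of a 440 Hz tone at an assumed 24 kHz source rate
t = np.arange(2400) / 24000.0
tone = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
out = resample_int16(tone, 24000, 48000)
```

Clipping before the cast avoids int16 wraparound, which would otherwise show up as loud clicks on peaks.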
Suggested Code Locations
- WebRTC_Demo/WebRTC_Demo/omini_backend_code/code/voice_chat/omni_stream.py
  - model wav -> WebRTC path
  - local TTS wav -> WebRTC path
  - any helper resampling functions
Optional Improvements for Choppy Playback
- reduce output chunk size (finer granularity)
- avoid queue starvation in output_audio
- delay play_end decision to avoid premature turn-end during short gaps
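One possible shape for two of these improvements, as a sketch only. split_frames and PlayEndDebouncer are hypothetical names, and the 20 ms frame size and 0.3 s grace period are assumed values, not taken from the demo code.

```python
import time


def split_frames(pcm: bytes, rate: int, frame_ms: int = 20) -> list:
    """Split mono int16 PCM bytes into fixed-size frames.

    The final frame is zero-padded so every frame has the same length.
    """
    frame_bytes = rate * frame_ms // 1000 * 2  # 2 bytes per int16 sample
    frames = []
    for i in range(0, len(pcm), frame_bytes):
        frame = pcm[i:i + frame_bytes]
        frames.append(frame.ljust(frame_bytes, b"\x00"))
    return frames


class PlayEndDebouncer:
    """Signal play_end only after the queue has stayed empty for a grace period."""

    def __init__(self, grace_s: float = 0.3, clock=time.monotonic):
        self.grace_s = grace_s
        self.clock = clock
        self._empty_since = None

    def update(self, queue_empty: bool, generation_done: bool) -> bool:
        """Return True once play_end should be emitted."""
        if not (queue_empty and generation_done):
            # New audio arrived or generation is still running: reset the timer
            self._empty_since = None
            return False
        now = self.clock()
        if self._empty_since is None:
            self._empty_since = now
        return now - self._empty_since >= self.grace_s
```

The debouncer would be polled from the playback loop, so a short gap between chunks resets the timer instead of ending the turn.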
Repro Steps
- cd demo/web_demo/WebRTC_Demo
- bash oneclick.sh start
- open frontend URL, start duplex voice conversation
- observe subtitle/state normal but audio silent/choppy