Mobile voice capture integration + baseline waveform, permissions, and cleanup by fbraza · Pull Request #17 · iva-mobile/client

fbraza · 2025-09-30T17:14:34Z

Summary

This PR implements the core of issue #7 (audio + speech integration) while keeping the experience focused on mobile (Android/iOS) and avoiding web-specific capture code for now. It wires microphone amplitude into the waveform and routes partial speech results into the transcription ViewModel. Baseline rendering is added so the waveform is visible even when silent or before recording starts.

Key Changes

Services
- AudioCaptureService + AudioCaptureServiceImpl (record): Emits normalized amplitudes from onAmplitudeChanged; handles start/pause/resume/stop and permission checks.
- SpeechRecognitionService + SpeechRecognitionServiceImpl (speech_to_text): Streams incremental transcription via transcriptionStream; uses SpeechListenOptions (no deprecations).
State & UI
- ViewModel (VoiceToTextModelState):
  - Seeds a sliding amplitude window with zeros (flat baseline) and updates it live during recording.
  - Coordinates recording lifecycle (start/pause/resume/stop/discard/restart), managing service subscriptions and timer.
  - Exposes transcribedText and lastError (surfaced via a SnackBar), and continues to drive the existing widgets.
- Waveform painter: Centers bars and guarantees a visible baseline via minBarHeight for silent frames.
Platform setup
- Android: RECORD_AUDIO permission in AndroidManifest.xml.
- iOS: NSMicrophoneUsageDescription and NSSpeechRecognitionUsageDescription in Info.plist.
DI / App
- main.dart: Provides audio + speech services via Provider and boots into the voice screen.
Documentation
- README updated with dependency and permission notes.
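
As an illustration of the seeded sliding window described under State & UI, here is a minimal plain-Dart sketch. It is not the code from this PR: the names normalizeDb and AmplitudeWindow, the -60 dB floor, and the assumption that raw amplitudes arrive as dBFS values (0 = loudest) are all hypothetical and chosen only for the example.

```dart
import 'dart:collection';

/// Hypothetical helper: maps a dBFS reading to the 0.0..1.0 range.
/// floorDb is an assumed noise floor, not a value taken from the PR.
double normalizeDb(double db, {double floorDb = -60.0}) {
  if (db.isNaN) return 0.0;
  final clamped = db.clamp(floorDb, 0.0).toDouble();
  return (clamped - floorDb) / (0.0 - floorDb);
}

/// Hypothetical fixed-size window, seeded with zeros so the waveform
/// shows a flat baseline before any audio arrives.
class AmplitudeWindow {
  AmplitudeWindow(this.size)
      : assert(size > 0),
        _samples = ListQueue<double>.of(List<double>.filled(size, 0.0));

  final int size;
  final ListQueue<double> _samples;

  /// Drops the oldest sample and appends the newest, keeping the length fixed.
  void push(double normalized) {
    _samples.removeFirst();
    _samples.addLast(normalized.clamp(0.0, 1.0).toDouble());
  }

  /// Restores the flat baseline, e.g. after a discard.
  void reset() {
    _samples
      ..clear()
      ..addAll(List<double>.filled(size, 0.0));
  }

  /// Snapshot of the current window, oldest sample first.
  List<double> get values => List<double>.unmodifiable(_samples);
}
```

In a setup like this, a ViewModel would call push(normalizeDb(raw)) for each amplitude event from the capture service and reset() when a recording is discarded; the window length never changes, so the painter always has the same number of bars to draw.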

Out of Scope / Notes

Web capture code was intentionally removed to stay aligned with the mobile-first scope. The baseline remains visible in web builds, but real capture is deferred until we target web explicitly.
Dependency updates required by pre-commit policy: record ^6.1.1, speech_to_text ^7.3.0, path_provider ^2.1.5.

Testing

Ran flutter analyze (clean) and flutter test (green). Unit tests updated to expect a seeded flat baseline when idle.
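
To make the idle-baseline expectation concrete, here is a small plain-Dart sketch of how a painter could turn normalized amplitudes into centered bars with a guaranteed minimum height. Only the name minBarHeight comes from this PR; barHeights, centeredTop, maxBarHeight, and the 2.0 default are hypothetical.

```dart
import 'dart:math' as math;

/// Hypothetical helper: converts normalized amplitudes (0.0..1.0) into bar
/// heights in logical pixels, never shorter than minBarHeight.
List<double> barHeights(
  List<double> amplitudes, {
  required double maxBarHeight,
  double minBarHeight = 2.0,
}) {
  return [
    for (final a in amplitudes)
      math.max(minBarHeight, a.clamp(0.0, 1.0).toDouble() * maxBarHeight),
  ];
}

/// Top offset that vertically centers a bar of the given height
/// inside a canvas of the given height.
double centeredTop(double canvasHeight, double height) =>
    (canvasHeight - height) / 2;
```

With an all-zero window, every bar comes out at minBarHeight, which is the flat baseline an idle-state test would check for.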

Screens/QA

On device/emulator: allow the mic permission, then tap the mic to start recording. The waveform reflects voice intensity, and transcription updates as you speak. Pause/resume/stop/discard flows work as expected.
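
For QA purposes, the recording flows can be thought of as a small state machine. The sketch below is an assumption about how the lifecycle might be modeled, not the ViewModel's actual logic: the RecordingState and RecordingAction enums and the transition function are hypothetical, and the real implementation may allow different transitions.

```dart
enum RecordingState { idle, recording, paused, stopped }

enum RecordingAction { start, pause, resume, stop, discard, restart }

/// Hypothetical transition table: returns the next state, or null when
/// the action is not valid in the current state.
RecordingState? transition(RecordingState current, RecordingAction action) {
  switch (action) {
    case RecordingAction.start:
      return current == RecordingState.idle ? RecordingState.recording : null;
    case RecordingAction.pause:
      return current == RecordingState.recording ? RecordingState.paused : null;
    case RecordingAction.resume:
      return current == RecordingState.paused ? RecordingState.recording : null;
    case RecordingAction.stop:
      final active = current == RecordingState.recording ||
          current == RecordingState.paused;
      return active ? RecordingState.stopped : null;
    case RecordingAction.discard:
      // Discarding returns to idle, where the flat baseline is shown again.
      return current == RecordingState.idle ? null : RecordingState.idle;
    case RecordingAction.restart:
      // Restarting begins a fresh recording from any non-idle state.
      return current == RecordingState.idle ? null : RecordingState.recording;
  }
}
```

A table like this gives a checklist for manual testing: each valid transition should be reachable from the UI, and each invalid one (for example, resume while idle) should be a no-op.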

Bump path_provider to ^2.1.5 for pre-commit policy

20abdbe

fbraza wants to merge 1 commit into main from feature/controls-issue-4