Mobile voice capture integration + baseline waveform, permissions, and cleanup#17

Open
fbraza wants to merge 1 commit into main from feature/controls-issue-4

fbraza commented Sep 30, 2025

Summary

This PR implements the core of issue #7 (audio + speech integration) while keeping the experience focused on mobile (Android/iOS) and avoiding web-specific capture code for now. It wires microphone amplitude into the waveform and routes partial speech results into the transcription ViewModel. Baseline rendering is added so the waveform is visible even when silent or before recording starts.
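For reviewers who want a mental model of the wiring described above, here is a minimal sketch in plain Dart. It is not the code in this PR: TranscriptionModel and its fields are hypothetical stand-ins for the real ViewModel, and the two StreamControllers stand in for the audio and speech services. It only illustrates the idea of subscribing to an amplitude stream and a partial-result stream, then cancelling both on cleanup.

```dart
import 'dart:async';

/// Hypothetical model (not the PR's ViewModel): listens to two streams and
/// keeps the most recent amplitude and partial transcription.
class TranscriptionModel {
  TranscriptionModel(Stream<double> amplitudes, Stream<String> partials) {
    _subs.add(amplitudes.listen((a) => latestAmplitude = a));
    // Each partial result replaces the previous one rather than appending.
    _subs.add(partials.listen((t) => transcribedText = t));
  }

  final List<StreamSubscription<dynamic>> _subs = [];
  double latestAmplitude = 0.0;
  String transcribedText = '';

  /// Cancels every subscription so nothing fires after cleanup.
  Future<void> dispose() async {
    for (final sub in _subs) {
      await sub.cancel();
    }
    _subs.clear();
  }
}

Future<void> main() async {
  // Stand-ins for the audio and speech services.
  final amps = StreamController<double>();
  final text = StreamController<String>();
  final model = TranscriptionModel(amps.stream, text.stream);

  amps.add(0.4);
  text.add('hello');
  text.add('hello world');
  // Yield to the event loop so the queued stream events are delivered.
  await Future<void>.delayed(Duration.zero);

  print(model.latestAmplitude); // 0.4
  print(model.transcribedText); // hello world

  await model.dispose();
  await amps.close();
  await text.close();
}
```

Keeping the subscriptions in one list makes the stop/discard paths symmetrical: whatever was subscribed on start is cancelled in one place.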

Key Changes

  • Services
    • AudioCaptureService + AudioCaptureServiceImpl (record): Emits normalized amplitudes from onAmplitudeChanged; handles start/pause/resume/stop and permission checks.
    • SpeechRecognitionService + SpeechRecognitionServiceImpl (speech_to_text): Streams incremental transcription via transcriptionStream; passes settings through SpeechListenOptions so no deprecated listen parameters are used.
  • State & UI
    • ViewModel (VoiceToTextModelState):
      • Seeds a sliding amplitude window with zeros (flat baseline) and updates it live during recording.
      • Coordinates recording lifecycle (start/pause/resume/stop/discard/restart), managing service subscriptions and timer.
      • Exposes transcribedText and lastError (surfaced to the user via a SnackBar), and continues to drive the existing widgets.
    • Waveform painter: Centers bars and guarantees a visible baseline via minBarHeight for silent frames.
  • Platform setup
    • Android: RECORD_AUDIO permission in AndroidManifest.xml.
    • iOS: NSMicrophoneUsageDescription and NSSpeechRecognitionUsageDescription in Info.plist.
  • DI / App
    • main.dart: Provides audio + speech services via Provider and boots into the voice screen.
  • Documentation
    • README updated with dependency and permission notes.
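The seeded window and the minBarHeight floor listed under State & UI can be pictured with the following plain-Dart sketch. It is an illustration under assumptions, not the PR's implementation: AmplitudeWindow, barHeight, the window size of 4, and the 2.0 default floor are all hypothetical, and amplitudes are assumed to be already normalized to the 0.0 to 1.0 range.

```dart
import 'dart:math' as math;

/// Hypothetical fixed-size sliding window of normalized amplitudes.
/// It starts filled with zeros, which renders as a flat baseline.
class AmplitudeWindow {
  AmplitudeWindow(int size)
      : _samples = List<double>.filled(size, 0.0, growable: true);

  final List<double> _samples;

  /// Read-only view of the current window, oldest sample first.
  List<double> get samples => List.unmodifiable(_samples);

  /// Drops the oldest sample and appends the newest, clamped to 0.0..1.0.
  void add(double amplitude) {
    _samples.removeAt(0);
    _samples.add(amplitude.clamp(0.0, 1.0).toDouble());
  }
}

/// Bar height for one sample; never smaller than [minBarHeight], so silent
/// frames still draw a visible bar. The 2.0 default is an assumed value.
double barHeight(double amplitude, double maxHeight,
    {double minBarHeight = 2.0}) {
  return math.max(minBarHeight, amplitude * maxHeight);
}

void main() {
  final window = AmplitudeWindow(4);
  print(window.samples); // [0.0, 0.0, 0.0, 0.0]

  window.add(0.5);
  window.add(1.2); // out-of-range input is clamped to 1.0
  print(window.samples); // [0.0, 0.0, 0.5, 1.0]

  final heights = window.samples.map((a) => barHeight(a, 40.0)).toList();
  print(heights); // [2.0, 2.0, 20.0, 40.0]
}
```

Because the window never changes length, the painter can lay out the same number of bars whether the app is idle, recording, or paused.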

Out of Scope / Notes

  • Web capture code was intentionally removed to stay aligned with the mobile-first scope. The baseline remains visible in web builds, but real capture is deferred until we target web explicitly.
  • Dependency updates required by pre-commit policy: record ^6.1.1, speech_to_text ^7.3.0, path_provider ^2.1.5.

Testing

  • Ran flutter analyze (clean) and flutter test (green). Unit tests updated to expect a seeded flat baseline when idle.

Screens/QA

  • On device/emulator: Allow mic permission, tap the mic to start recording. The waveform reflects voice intensity, and transcription updates as you speak. Pause/resume/stop/discard flows work as expected.
