Live transcription: idle onboarding (no highlight) + real-time speech #18

Open
fbraza wants to merge 7 commits into main from feature/live-transcription-issue-7
@fbraza (Contributor) commented Sep 30, 2025

### Summary
This PR ensures live transcription text displays in real time during recording and no text is highlighted when idle.

### Fixes
- Clear onboarding transcript on startRecording so incoming speech is shown immediately.
- Cursor/active chip appears only while recording; hidden when idle/paused/stopped.
- Speech recognition now always initializes if needed before listening, and ignores empty partials.

### Files
- lib/core/services/speech_recognition.dart: init guard, partial handling.
- lib/features/voice_to_text/view/voice_to_text_model.dart: clear transcript and toggle cursor on start, keep highlight behavior tied to lifecycle.
- lib/features/voice_to_text/view/voice_to_text_screen.dart: startup state with no highlight.

### Verification
- flutter analyze (clean), flutter test (green).
- On device/emulator: press mic, speak; text area streams recognized words; pausing/stopping hides the cursor and keeps text.
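For readers without the diff at hand, the init guard and empty-partial filter described under Fixes could look roughly like the sketch below. It is not the code in lib/core/services/speech_recognition.dart: `SpeechEngine`, `SpeechRecognitionSketch`, `handlePartial`, `onWords`, and `FakeEngine` are hypothetical names invented for illustration, and the engine interface does not mirror any real plugin API.

```dart
// Hypothetical engine interface (an assumption, not a real plugin API).
abstract class SpeechEngine {
  Future<bool> initialize();
  Future<void> listen(void Function(String recognizedWords) onResult);
}

// Sketch of a service that initializes lazily and drops empty partials.
class SpeechRecognitionSketch {
  SpeechRecognitionSketch(this._engine, {required this.onWords});

  final SpeechEngine _engine;
  final void Function(String words) onWords;
  bool _initialized = false;

  Future<bool> startListening() async {
    // Init guard: initialize only if a previous call has not succeeded.
    if (!_initialized) {
      _initialized = await _engine.initialize();
      if (!_initialized) return false;
    }
    await _engine.listen(handlePartial);
    return true;
  }

  void handlePartial(String recognizedWords) {
    // Ignore empty or whitespace-only partial results.
    if (recognizedWords.trim().isEmpty) return;
    onWords(recognizedWords);
  }
}

// Test double so the sketch can be exercised without a device.
class FakeEngine implements SpeechEngine {
  int initCalls = 0;

  @override
  Future<bool> initialize() async {
    initCalls++;
    return true;
  }

  @override
  Future<void> listen(void Function(String recognizedWords) onResult) async {}
}
```

Keeping the filter in a plain method makes it easy to unit test without driving a real recognizer.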

… recording

- Replace hard-coded demo transcript with onboarding instruction
- Chip + blinking cursor now appear only when recording (isCursorVisible)
- Toggle cursor visibility on recording lifecycle transitions
- No changes to services; speech stream continues to update transcript in real time

@chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Comment on lines 18 to 54
 return ChangeNotifierProvider<VoiceToTextModelState>(
-  create: (_) => VoiceToTextModelState(
+  create: (context) => VoiceToTextModelState(
     initialTranscript: const [
-      'Herman',
-      'is',
-      'just',
-      'like',
+      'Tap',
       'the',
-      'rest',
+      'microphone',
+      'and',
+      'tell',
+      'your',
+      'AI',
+      'Chief',
       'of',
-      'us.',
-      'Everyday',
-      'he',
-      'has',
+      'Staff',
+      'what',
       'to',
-      'make',
-      'all',
-      'kin',
+      'do',
+      '—',
+      'send',
+      'a',
+      'Slack',
+      'message,',
+      'draft',
+      'an',
+      'email,',
+      'check',
+      'the',
+      'calendar,',
+      'or',
+      'follow',
+      'up',
+      'for',
+      'you.',
     ],
     initialActiveWordIndex: 14,
+    audio: context.read<AudioCaptureService>(),
+    speech: context.read<SpeechRecognitionService>(),

[P1] Idle screen starts with cursor still visible

The voice screen builds VoiceToTextModelState without overriding initialCursorVisible. Because the model’s default is true, the onboarding text still renders the active-word chip the moment the screen appears even though the recording state is idle. The headline feature here is that the chip should show only while recording, so the model needs to start with initialCursorVisible: false (or change the default) before displaying the onboarding copy.
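One way to read this suggestion is to flip the default so that the cursor is hidden unless recording has started. The sketch below is a simplified stand-in, not the PR's model: `CursorModelSketch` and `stopRecording` are hypothetical names, the real VoiceToTextModelState is used with ChangeNotifierProvider and takes audio/speech services, and both are omitted here to keep the example plain Dart.

```dart
// Simplified stand-in for the model; notification and services are omitted.
class CursorModelSketch {
  CursorModelSketch({
    List<String> initialTranscript = const [],
    bool initialCursorVisible = false, // assumed default: hidden while idle
    int? initialActiveWordIndex,
  })  : _transcript = List<String>.of(initialTranscript),
        _isCursorVisible = initialCursorVisible,
        _activeWordIndex = initialActiveWordIndex;

  final List<String> _transcript;
  bool _isCursorVisible;
  int? _activeWordIndex;

  List<String> get transcript => List.unmodifiable(_transcript);
  bool get isCursorVisible => _isCursorVisible;
  int? get activeWordIndex => _activeWordIndex;

  void startRecording() {
    // Drop the onboarding copy so incoming speech starts from a clean slate.
    _transcript.clear();
    _activeWordIndex = null;
    _isCursorVisible = true;
  }

  void stopRecording() {
    // Keep the text, hide the chip and cursor.
    _isCursorVisible = false;
  }
}
```

Changing the default rather than passing `initialCursorVisible: false` at the call site means a future screen that forgets the argument still starts without a highlight.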

Comment on lines 189 to +232
@override
void dispose() {
  _timer?.cancel();
  _waveformController.close();
  _teardownStreams();

[P1] Disposing model leaves audio and speech services running

VoiceToTextModelState.dispose currently cancels timers and subscriptions but never stops the injected AudioCaptureService or SpeechRecognitionService. If the user navigates away while recording, the microphone and speech recognizer continue running in the background, because only _teardownStreams() is invoked. Disposing the model should call audio.stop() and speech.cancel() (or a similar teardown) to release system resources and avoid capturing audio after the UI disappears.
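A minimal sketch of the teardown suggested here, under stated assumptions: the two service interfaces below are hypothetical reductions to the `stop()` and `cancel()` calls named in the comment, and `DisposableModelSketch`, `FakeAudio`, and `FakeSpeech` are invented for illustration. The real services may expose different methods.

```dart
import 'dart:async';

// Hypothetical service interfaces, reduced to the calls the review names.
abstract class AudioCaptureService {
  Future<void> stop();
}

abstract class SpeechRecognitionService {
  Future<void> cancel();
}

class DisposableModelSketch {
  DisposableModelSketch({required this.audio, required this.speech});

  final AudioCaptureService audio;
  final SpeechRecognitionService speech;

  Timer? _timer; // would be set when recording starts (omitted here)
  StreamSubscription<String>? _speechSub;
  final StreamController<double> _waveformController =
      StreamController<double>.broadcast();
  bool _disposed = false;

  void _teardownStreams() {
    unawaited(_speechSub?.cancel());
    _speechSub = null;
  }

  void dispose() {
    if (_disposed) return; // make repeated calls harmless
    _disposed = true;
    _timer?.cancel();
    _teardownStreams();
    // Release the microphone and recognizer. dispose() is synchronous,
    // so the returned futures are intentionally not awaited.
    unawaited(audio.stop());
    unawaited(speech.cancel());
    unawaited(_waveformController.close());
  }
}

// Test doubles that count teardown calls.
class FakeAudio implements AudioCaptureService {
  int stopCalls = 0;

  @override
  Future<void> stop() async {
    stopCalls++;
  }
}

class FakeSpeech implements SpeechRecognitionService {
  int cancelCalls = 0;

  @override
  Future<void> cancel() async {
    cancelCalls++;
  }
}
```

Whether the model should stop services it did not create is a design choice; if the services are shared across screens, stopping only the session this model started may be the safer variant.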

…n
- initialCursorVisible=false and active index reset for idle
- On startRecording, clear transcript and enable cursor so incoming speech streams immediately
- Keep highlight strictly tied to recording state
@fbraza self-assigned this on Sep 30, 2025
@fbraza changed the title from "Live transcription UI: onboarding text + highlight only while recording" to "Live transcription: idle onboarding (no highlight) + real-time speech" on Sep 30, 2025
…ediately on record

- Always initialize speech engine if not initialized before listening
- Ignore empty partials; stream recognizedWords when available
- Clear onboarding transcript at startRecording and enable cursor
- Startup state uses no cursor/active highlighting

…permission

- Resolve missing live text by ensuring engine is initialized and localeId is set
- Configure listenFor/pauseFor to keep the session open
- Add android INTERNET permission for cloud-backed recognition