Add ChatGPT-style voice transcription and image attachments #5

@PixelShober

Description

Goal / Problem statement

  • Add ChatGPT-style voice input (microphone capture -> transcription -> insert into chat input) and an image attachment flow (upload + thumbnail), both aligned with the Nele AI API.

Acceptance Criteria (bullet list)

  • Microphone button starts/stops recording and transcribes the audio to text via /transcription; the text appears in the chat input for editing before send.
  • Image attachments are uploaded via /image-attachment (png/jpg/webp only) and shown as a small thumbnail preview, as in ChatGPT; the chat request then includes the attachment with type=image and content set to the uploaded path.
  • Drag-and-drop overlay warns immediately about unsupported file types and about audio or image files that exceed the API size limits.
  • All changes include automated tests covering attachment policy, API payload shape, and UI elements.
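
A minimal sketch of the payload-shape criterion above, written in Python only for brevity (the WPF client would express the same shape in its own code). The helper name `build_image_attachment` is hypothetical; only the `type=image` and `content=<uploaded path>` fields come from this issue, and the surrounding chat request envelope is deliberately left out because it is not described here.

```python
from typing import Dict


def build_image_attachment(uploaded_path: str) -> Dict[str, str]:
    """Build the attachment entry for an image that was already uploaded.

    `uploaded_path` is assumed to be the path returned by the
    /image-attachment upload; this helper name is hypothetical.
    """
    if not uploaded_path or not uploaded_path.strip():
        raise ValueError("uploaded_path must be the non-empty path returned by the upload")
    # Field names taken from the acceptance criteria: type=image, content=uploaded path.
    return {"type": "image", "content": uploaded_path}
```

A unit test for the payload shape could assert on exactly these two keys, so that a change in either field name is caught before it reaches the API.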

Scope (in scope / out of scope)

  • In scope: WPF UI changes, upload/transcription API integration, attachment allowlist, previews, and tests.
  • Out of scope: Server-side changes or Nele API schema changes.

Constraints (perf, security, compatibility)

  • Do not log secrets; do not block the UI thread during transcription/upload; keep WPF responsive.
  • Use the API limits from the Nele docs (image-attachment types: png/jpg/webp; transcription: audio files up to 200 MB).
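
The limits above could drive a single attachment check shared by the drag-and-drop overlay and the unit tests. The following is a rough sketch in Python, not the client's actual code: `check_attachment` is a hypothetical name, the audio extension allowlist is passed in by the caller because this issue does not list the accepted audio formats, the image size limit is optional for the same reason, and treating 200 MB as 200,000,000 bytes is an assumption to verify against the Nele docs.

```python
from pathlib import PurePath
from typing import Optional, Set

# Image types named in this issue. Whether ".jpeg" is also accepted is not stated.
IMAGE_EXTENSIONS = {".png", ".jpg", ".webp"}
# Assumption: "200 MB" means decimal megabytes; confirm against the Nele docs.
MAX_AUDIO_BYTES = 200 * 1000 * 1000


def check_attachment(
    filename: str,
    size_bytes: int,
    audio_extensions: Set[str],
    max_image_bytes: Optional[int] = None,
) -> Optional[str]:
    """Return a warning message for a rejected file, or None if it is accepted.

    `audio_extensions` and `max_image_bytes` are caller-supplied because
    the issue does not specify them.
    """
    ext = PurePath(filename).suffix.lower()
    if ext in IMAGE_EXTENSIONS:
        if max_image_bytes is not None and size_bytes > max_image_bytes:
            return "Image file exceeds the size limit"
        return None
    if ext in audio_extensions:
        if size_bytes > MAX_AUDIO_BYTES:
            return "Audio file exceeds the 200 MB limit"
        return None
    return f"Unsupported file type: {ext or '(none)'}"
```

Returning a message instead of raising keeps the check cheap enough to call while the user is still dragging a file over the window, which is when the overlay warning is expected to appear.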

Test expectations (unit/integration/manual)

  • Unit tests for attachment validation and API payload.
  • UI tests for microphone button, thumbnail preview, and overlay warnings.

Notes / links / screenshots (NO secrets)

  • Doc source: Clients/_Research/nele.ai documentation.html
