diff --git a/docs/speech-to-text/realtime/turn-detection.mdx b/docs/speech-to-text/realtime/turn-detection.mdx
index 396993e9..d96fcf44 100644
--- a/docs/speech-to-text/realtime/turn-detection.mdx
+++ b/docs/speech-to-text/realtime/turn-detection.mdx
@@ -120,7 +120,7 @@ You can also use `ForceEndOfUtterance` with multi-channel diarization:
 }
 ```
 
-When this message is received, the server will send an [AddTranscript](../../api-ref/realtime-transcription-websocket#addtranscript) message, followed by an [EndOfUtterance](../../api-ref/realtime-transcription-websocket#endofutterance) message.
+When this message is received, the server will send an [AddTranscript](../../api-ref/realtime-transcription-websocket#addtranscript) message with a `forced` field indicating that it was triggered by a `ForceEndOfUtterance` request, followed by an [EndOfUtterance](../../api-ref/realtime-transcription-websocket#endofutterance) message.
 
 ## Semantic turn detection
 
diff --git a/spec/realtime.yaml b/spec/realtime.yaml
index 58be4291..424d8406 100644
--- a/spec/realtime.yaml
+++ b/spec/realtime.yaml
@@ -483,6 +483,10 @@ components:
           type: string
           description: |
             The channel identifier to which the audio belongs. This field is only seen in multichannel.
+        forced:
+          type: boolean
+          description: |
+            Indicates whether this message was triggered as a result of a ForceEndOfUtterance request.
       required:
         - message
         - metadata
@@ -506,6 +510,10 @@ components:
           type: string
           description: |
             The channel identifier to which the audio belongs. This field is only seen in multichannel.
+        forced:
+          type: boolean
+          description: |
+            Indicates whether this message was triggered as a result of a ForceEndOfUtterance request.
       required:
         - message
         - metadata
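
Reviewer note: a client could consume the new field roughly as in the sketch below. This is a minimal illustration, not code from this change. It assumes that `forced` arrives as a top-level boolean on the JSON message and that it may be absent, since the spec does not add it to `required`. The helper name `is_forced_transcript` and the sample payloads are hypothetical.

```python
import json


def is_forced_transcript(raw: str) -> bool:
    """Return True if `raw` is an AddTranscript message flagged as forced.

    Assumption: `forced` is a top-level boolean on the message object.
    """
    msg = json.loads(raw)
    if not isinstance(msg, dict) or msg.get("message") != "AddTranscript":
        return False
    # `forced` is not listed under `required`, so a missing field counts as False.
    return msg.get("forced", False) is True


# Hypothetical payloads, trimmed to the fields this sketch inspects.
forced_msg = json.dumps(
    {"message": "AddTranscript", "metadata": {"transcript": "hello"}, "forced": True}
)
normal_msg = json.dumps(
    {"message": "AddTranscript", "metadata": {"transcript": "hello"}}
)

print(is_forced_transcript(forced_msg))  # True
print(is_forced_transcript(normal_msg))  # False
```

Treating a missing `forced` as `False` keeps such a client compatible with servers that predate this change.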