@Peggy0422

To support audio products with a TTS (text-to-speech) function, this PR makes the following changes:

  1. Add TTSCapabilities (optional): indicates whether the device supports the TTS function and, if so, its TTS configuration. The new complex type "TTSCapabilities" is added to the existing complex type "AudioClipCapabilities".
     Parameters:
       1. MaxContentLength: the maximum length of the text content that the device can convert to an audio clip.
       2. TTSLanguage: the language(s) the device supports for the TTS function.
       3. TTSVoiceType: the voice types the device supports for the TTS function.
  2. Add "AddTTSAudioClip" and "AddTTSAudioClipResponse": send a text, a TTS configuration and an audio clip configuration to the device. The device converts the text to an audio clip based on the TTS configuration, then plays the clip according to the audio clip configuration.
     Parameters:
       1. Token (optional): token for the audio clip.
       2. Configuration: audio clip configuration to add; see element "Configuration".
       3. TTSConfiguration: for a TTS audio clip, specifies the audio content, language and voice type used when the device plays the clip.
     Response:
       1. Token: unique token of the TTS audio clip to be uploaded.
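
As a rough illustration of the request described above, a message body might look like the sketch below. This is not taken from the PR diff: the `tr2` namespace binding, the qualified child elements, the sample text and the `Language` and `VoiceType` values are all assumptions made for the example, and the contents of `Configuration` are left out because they are not listed in this description.

```xml
<!-- Illustrative sketch only; element layout and values are assumptions. -->
<tr2:AddTTSAudioClip xmlns:tr2="http://www.onvif.org/ver20/media/wsdl">
  <!-- Token is optional according to the description and is omitted here. -->
  <tr2:Configuration>
    <!-- Audio clip configuration; child elements intentionally not shown. -->
  </tr2:Configuration>
  <tr2:TTSConfiguration>
    <!-- Text that the device would convert to an audio clip. -->
    <tr2:Content>Attention please: the building closes in ten minutes.</tr2:Content>
    <!-- Hypothetical values; the supported ones would come from TTSCapabilities. -->
    <tr2:Language>English</tr2:Language>
    <tr2:VoiceType>Female</tr2:VoiceType>
  </tr2:TTSConfiguration>
</tr2:AddTTSAudioClip>
```

A client would presumably read TTSCapabilities first and choose `Language` and `VoiceType` from the values the device reports.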

media2.wsdl

  1. Updated complexType "AudioClipCapabilities" with element "TTSCapabilities"; added complexType "TTSCapabilities" with attributes "MaxContentLength", "TTSLanguage" and "TTSVoiceType"; added simpleType "TTSLanguage" and "TTSVoiceType".
  2. Added elements "AddTTSAudioClip" and "AddTTSAudioClipResponse" for sending a text, TTS configuration and audio clip configuration to the device.
  3. Added complexType "TTSAudio" for element "TTSConfiguration". It includes parameters such as Content, Language, VoiceType.
  4. Added "AddTTSAudioClipRequest" and "AddTTSAudioClipResponse"
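
For readers without the diff at hand, the capability changes listed above could take roughly the following shape. This is a minimal sketch, not the PR's actual schema: the attribute types (`xs:int` and lists of `xs:string`), the `minOccurs="0"` on the new element, and the omission of the existing AudioClipCapabilities members are all assumptions.

```xml
<!-- Sketch only: types and occurrence constraints are assumptions. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:tr2="http://www.onvif.org/ver20/media/wsdl"
           targetNamespace="http://www.onvif.org/ver20/media/wsdl"
           elementFormDefault="qualified">

  <xs:complexType name="AudioClipCapabilities">
    <xs:sequence>
      <!-- New optional element; existing members of this type are not shown. -->
      <xs:element name="TTSCapabilities" type="tr2:TTSCapabilities" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="TTSCapabilities">
    <!-- Maximum length of the text the device can convert (assumed integer). -->
    <xs:attribute name="MaxContentLength" type="xs:int"/>
    <!-- Supported languages, modelled here as a whitespace-separated list. -->
    <xs:attribute name="TTSLanguage">
      <xs:simpleType>
        <xs:list itemType="xs:string"/>
      </xs:simpleType>
    </xs:attribute>
    <!-- Supported voice types, modelled the same way. -->
    <xs:attribute name="TTSVoiceType">
      <xs:simpleType>
        <xs:list itemType="xs:string"/>
      </xs:simpleType>
    </xs:attribute>
    <xs:anyAttribute processContents="lax"/>
  </xs:complexType>
</xs:schema>
```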

media2.xml and documentation

  1. Added detailed descriptions for AddTTSAudioClip operations, explaining their purpose, parameters, and responses.
  2. Updated audio clip capabilities with TTSCapabilities.

1. Added AddTTSAudioClip request and AddTTSAudioClip response for sending a text and its TTS configuration to the device (1621-1652) (2036-2041) (2418-2422) (2935-2943).
2. Added complex type "TTSAudio" (1465-1485) for TTSConfiguration to support the TTS function. It includes the parameters Content, Language and VoiceType.
3. Updated AudioClipCapabilities with TTSCapabilities (177-181), and added the complex type TTSCapabilities (201-220) to indicate that the device supports the TTS function and its corresponding configuration. The complex type TTSCapabilities includes MaxContentLength, TTSLanguage and TTSVoiceType.
4. Added simpleType TTSLanguage (220-231) and TTSVoiceType (232-238).
1. Added detailed descriptions for AddTTSAudioClip operations, explaining their purpose, parameters, and responses (2359-2416).
2. Updated audio clip capabilities with TTSCapabilities (2698-2700).
Update code line information for TTS function
Correct some editorial errors
Updated the description of the AddTTSAudioClip operation to clarify the parameters and response. Updated the description of TTSCapabilities.
The TTS audio clip pull request was first created as #668
Updated TTS configuration description and added TTSCapabilities entry.
@sujithhanwha
Contributor

OLD PR for reference
#668

@ocampana-videotec added this to the 26.06 milestone on Dec 4, 2025
doc/Media2.xml Outdated
</varlistentry>
</variablelist>
<para></para>
<para><emphasis role="bold">Note:</emphasis> Audio clip uploads to the device can fail in the following scenarios, and a specific HTTP error code should be returned to the client when an upload fails.</para>
Contributor

@venki5685 Dec 9, 2025

This note does not seem applicable to TTSAudioClip.

Author

Yes, it is not for TTS; I will delete it.

Delete inappropriate note for operation AddTTSAudioClip
Contributor

@johado left a comment

Some small textual comments.

doc/Media2.xml Outdated
<title>AddTTSAudioClip</title>
<para>This operation adds a text, audio clip configuration and TTS configuration to the device, for device converting the text to an audio clip based on the TTS configuration.
The response to the command includes a unique token for this converted audio clip.
If the device is unable to support language specified in the TTS configuration, the associated configuration will deleted from the device.</para>
Contributor

Add "be": "will be deleted".

Author

Okay, got it.

<term>response</term>
<listitem>
<para role="param">Token - [tt:ReferenceToken]</para>
<para role="text">Unique token of the TTS audio clip to be uploaded.</para>
Contributor

Change "to be uploaded" to "that was added" ?

doc/Media2.xml Outdated
</varlistentry>
<varlistentry>
<term>TTSCapabilities</term>
<listitem><para>Indicates device supports TTS function and TTS configuration.See tr2: TTSCapabilities.</para></listitem>
Contributor

Add space after .: "..configuration. See tr2:..."

Author

Okay, thank you.

</xs:element>
<xs:element name="Language" type="xs:string">
<xs:annotation>
<xs:documentation>Language for the TTS audio clip playback. See tr2: TTSLanguage. </xs:documentation>
Contributor

Change to "See tr2:TTSLanguage and TTSCapabilities." ?

Author

Thank you for your suggestion. TTSLanguage is already an attribute within TTSCapabilities. If we want to point out that the language for TTS audio clip playback must be one of the languages supported by the device, we could revise the explanation to state this clearly, for example: "The language which is supported and used for TTS audio clip playback."

</xs:element>
<xs:element name="VoiceType" type="xs:string">
<xs:annotation>
<xs:documentation>The voice type for the TTS audio clip playback. See tr2: TTSVoiceType.</xs:documentation>
Contributor

Change to "See tr2:TTSVoiceType and TTSCapabilities." ?

Author

I propose updating the explanation for TTSVoiceType in the same way as the commit for TTSLanguage.

<xs:sequence>
<xs:element name="Token" type="tt:ReferenceToken">
<xs:annotation>
<xs:documentation>Unique token of the TTS audio clip to be uploaded.</xs:documentation>

Change "to be uploaded" to something more relevant: converted, generated, ...?

<xs:anyAttribute processContents="lax"/>
</xs:complexType>
<!--===============TTS Language================-->
<xs:simpleType name="TTSLanguage">

What is the reasoning behind the choice of languages in the list below?

@robberos Dec 16, 2025

Is there any standard for official language names that can be referred to?

TTSCapabilities and TTSAudio use open strings, so an enum should provide a good pattern.
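
One possible way to address the question, sketched here purely as an illustration and not as what the PR does: XML Schema has a built-in `xs:language` type for language tags such as `en-US`, and an enumeration can restrict it to document a recommended set. The two tags below are hypothetical examples and do not reflect the list in the PR.

```xml
<!-- Hypothetical alternative; the enumerated tags are examples only. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="TTSLanguage">
    <!-- xs:language holds language tags of the form used in BCP 47. -->
    <xs:restriction base="xs:language">
      <xs:enumeration value="en-US"/>
      <xs:enumeration value="zh-CN"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
```

Because the elements that carry the value are open strings, such an enumeration would act as a documented pattern rather than a validation constraint.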

doc/Media2.xml Outdated
</itemizedlist>
</section>
</section>
<section xml:id="section_wvd_dzg_rye">
@robberos Dec 16, 2025

The id should be unique in XML, right? This seems to be a copy of the SetAudioClip section below.

Author

Yes, thank you for the suggestion. I have revised it accordingly.
