(feat:model/cosyvoice):use new protocol to build websocket connection #188

songguocola · 2026-02-05T03:43:34Z

No description provided.

gemini-code-assist · 2026-02-05T03:43:49Z

Summary of Changes

Hello @songguocola, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request fundamentally changes how audio services, particularly speech synthesis, establish and manage their network connections. By introducing a dedicated WebSocket protocol layer, it aims to provide a more robust and efficient mechanism for real-time audio data exchange. This refactoring centralizes WebSocket logic, making the SpeechSynthesizerV2 more modular and testable, and sets the foundation for improved streaming capabilities.

Highlights

New WebSocket Protocol Implementation: Introduced AudioWebsocketCallback and AudioWebsocketRequest classes to establish and manage WebSocket connections for audio-related services, providing a standardized way to handle real-time audio data.
SpeechSynthesizerV2 Integration: The SpeechSynthesizerV2 class has been refactored to utilize this new WebSocket protocol for real-time speech synthesis, handling connection, message exchange, and event processing for TTS tasks.
Enhanced Testing for Streaming: A new unit test, TestTtsV2SpeechSynthesizerV2, was added to validate the streaming speech synthesis functionality using a mock WebSocket server, ensuring the reliability and correctness of the new protocol integration.

Changelog

src/main/java/com/alibaba/dashscope/audio/protocol/AudioWebsocketCallback.java
- Added a new interface defining callbacks for WebSocket events (onOpen, onMessage, onError, onClose).
src/main/java/com/alibaba/dashscope/audio/protocol/AudioWebsocketRequest.java
- Added a new class extending WebSocketListener to handle WebSocket connection lifecycle, message sending (text and binary), and status management.
- Includes methods for connecting, closing, and sending messages, with timeout handling for connection.
src/main/java/com/alibaba/dashscope/audio/ttsv2/SpeechSynthesizerV2.java
- Added a new class implementing AudioWebsocketCallback for speech synthesis.
- Integrates AudioWebsocketRequest to manage the WebSocket connection for TTS.
- Implements methods for starting, continuing, and finishing speech synthesis tasks via WebSocket messages.
- Handles incoming text and binary messages from the WebSocket, processing speech synthesis results and audio frames.
src/test/java/com/alibaba/dashscope/TestTtsV2SpeechSynthesizerV2.java
- Added a new JUnit 5 test class to verify the SpeechSynthesizerV2's streaming capabilities.
- Uses MockWebServer to simulate WebSocket server behavior, sending task-started, task-finished, and binary audio messages.
- Asserts that audio frames are correctly received and accumulated during a streaming call.
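
As a rough illustration of the callback layer described in the changelog, the sketch below shows what an interface with onOpen, onMessage, onError, and onClose might look like, plus a small recording implementation of the kind a test could assert against. Only the four method names come from the changelog; the parameter lists, the split into text and binary onMessage overloads, and the names AudioWebsocketCallbackSketch and RecordingCallback are assumptions made for this example, not the PR's actual API.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: method names follow the changelog, but every
// parameter list here is an assumption, not the PR's actual signature.
interface AudioWebsocketCallbackSketch {
  void onOpen();
  void onMessage(String text);       // text frame, e.g. a JSON event
  void onMessage(ByteBuffer binary); // binary frame, e.g. an audio chunk
  void onError(Throwable cause);
  void onClose(int code, String reason);
}

// Records events in arrival order so a test can assert on them.
final class RecordingCallback implements AudioWebsocketCallbackSketch {
  final List<String> events = new ArrayList<>();
  int audioBytes = 0;

  @Override public void onOpen() { events.add("open"); }

  @Override public void onMessage(String text) { events.add("text:" + text); }

  @Override public void onMessage(ByteBuffer binary) {
    audioBytes += binary.remaining(); // count bytes without consuming the buffer
    events.add("binary");
  }

  @Override public void onError(Throwable cause) { events.add("error"); }

  @Override public void onClose(int code, String reason) { events.add("close:" + code); }
}
```

Keeping the transport behind an interface like this is what lets a synthesizer class be driven by a mock server or by direct method calls in a unit test.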

Activity

The pull request was created by songguocola.
No additional comments or reviews have been made on this pull request yet.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page. Here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	`@gemini-code-assist`	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request introduces a new WebSocket protocol for CosyVoice speech synthesis. While a security audit found no vulnerabilities of medium severity or higher, the code review identified several areas for improvement. These include concerns regarding thread safety, exception handling (e.g., potential null pointer exceptions and improper InterruptedException handling), and critical flaws in the test implementation. Addressing these issues will enhance the robustness and reliability of the new functionality.

src/test/java/com/alibaba/dashscope/TestTtsV2SpeechSynthesizerV2.java

gemini-code-assist · 2026-02-05T03:46:57Z

src/main/java/com/alibaba/dashscope/audio/protocol/AudioWebsocketRequest.java

+  public void onClosed(WebSocket webSocket, int code, String reason) {
+    isOpen.set(false);
+    isClosed.set(true);
+    connectLatch.get().countDown();


The call to connectLatch.get() could result in a NullPointerException if onClosed is called before connect() has initialized the latch. It's safer to add a null check, similar to what's done in onFailure.

Suggested change

-    connectLatch.get().countDown();
+    if (connectLatch.get() != null) {
+      connectLatch.get().countDown();
+    }
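
A variant of the null check above reads the reference into a local variable once, so the check and the countDown() call are guaranteed to act on the same object. The sketch below assumes connectLatch is an AtomicReference<CountDownLatch>, which the connectLatch.get() calls in the diff suggest but do not confirm. LatchHolderSketch is a hypothetical stand-in for the listener class, not code from the PR.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the listener; only the latch handling is shown.
final class LatchHolderSketch {
  // Stays null until connect() installs a latch, mirroring the scenario in the review.
  final AtomicReference<CountDownLatch> connectLatch = new AtomicReference<>();

  void connect() {
    connectLatch.set(new CountDownLatch(1));
  }

  // Reads the reference once, so the null check and countDown() see the same object.
  void onClosed() {
    CountDownLatch latch = connectLatch.get();
    if (latch != null) {
      latch.countDown();
    }
  }
}
```

Calling onClosed() before connect() is then a no-op rather than a NullPointerException, and a concurrent set() between the check and the call cannot change which latch is counted down.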

src/main/java/com/alibaba/dashscope/audio/protocol/AudioWebsocketRequest.java

gemini-code-assist · 2026-02-05T03:46:57Z

src/main/java/com/alibaba/dashscope/audio/ttsv2/SpeechSynthesizerV2.java

+      } catch (InterruptedException ignored) {
+        log.error("Interrupted while waiting for streaming complete");
+      }


Catching InterruptedException and not restoring the thread's interrupted status can lead to subtle and hard-to-debug issues, especially in concurrent code. The interrupted flag should be restored by calling Thread.currentThread().interrupt(). This is handled correctly in other parts of this class, like in streamingCall.

} catch (InterruptedException e) {
  log.error("Interrupted while waiting for streaming complete", e);
  Thread.currentThread().interrupt();
}
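
To make the effect of restoring the flag concrete, here is a minimal, self-contained sketch that uses only the standard library. The class and method names (InterruptRestoreSketch, awaitQuietly, flagSurvivesInterrupt) are invented for this illustration and do not appear in the PR. A worker thread is interrupted while waiting on a latch, and because the catch block calls Thread.currentThread().interrupt(), code running later on that thread can still observe the interruption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

final class InterruptRestoreSketch {
  // Waits on a latch; if interrupted, re-asserts the flag so callers can still observe it.
  static void awaitQuietly(CountDownLatch latch) {
    try {
      latch.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Returns whether the worker still sees its interrupted flag after awaitQuietly() returns.
  static boolean flagSurvivesInterrupt() {
    CountDownLatch neverReleased = new CountDownLatch(1);
    AtomicBoolean seen = new AtomicBoolean(false);
    Thread worker = new Thread(() -> {
      awaitQuietly(neverReleased);
      seen.set(Thread.currentThread().isInterrupted());
    });
    worker.start();
    worker.interrupt();
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while joining worker", e);
    }
    return seen.get();
  }
}
```

Without the interrupt() call in the catch block, isInterrupted() would report false, because await() clears the flag when it throws.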

src/test/java/com/alibaba/dashscope/TestTtsV2SpeechSynthesizerV2.java

src/main/java/com/alibaba/dashscope/audio/protocol/AudioWebsocketRequest.java

gemini-code-assist · 2026-02-05T03:46:57Z

src/main/java/com/alibaba/dashscope/audio/ttsv2/SpeechSynthesizerV2.java

+   * Accumulates audio data to audioStream. Reuses existing buffer when possible to minimize
+   * allocations.
+   */
+  private void accumulateAudioData(ByteBuffer frame) throws Exception {


The method signature declares throws Exception, which is too broad. The only checked exception that can be thrown from the method body is IOException from audioStream.write(). It's better to declare the specific exception, throws IOException, for better error handling and code clarity.

Suggested change

-  private void accumulateAudioData(ByteBuffer frame) throws Exception {
+  private void accumulateAudioData(ByteBuffer frame) throws java.io.IOException {
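
The narrowed signature can be sketched as follows. The method body in the PR is not shown in this thread, so the copy logic here is an assumption: it only illustrates that a method writing a ByteBuffer into a ByteArrayOutputStream through OutputStream.write(byte[]) needs to declare IOException and nothing broader. AudioAccumulatorSketch and demo() are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

final class AudioAccumulatorSketch {
  private final ByteArrayOutputStream audioStream = new ByteArrayOutputStream();

  // Declares only IOException, the checked exception in the signature of OutputStream.write(byte[]).
  void accumulateAudioData(ByteBuffer frame) throws IOException {
    byte[] chunk = new byte[frame.remaining()];
    frame.duplicate().get(chunk); // duplicate() leaves the caller's position untouched
    audioStream.write(chunk);
  }

  int size() {
    return audioStream.size();
  }

  // Appends a 2-byte and a 3-byte frame and returns the accumulated size.
  static int demo() {
    AudioAccumulatorSketch acc = new AudioAccumulatorSketch();
    try {
      acc.accumulateAudioData(ByteBuffer.wrap(new byte[] {1, 2}));
      acc.accumulateAudioData(ByteBuffer.wrap(new byte[] {3, 4, 5}));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return acc.size();
  }
}
```

With the narrower declaration, callers can catch IOException specifically instead of being forced to handle every Exception.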

gemini-code-assist · 2026-02-05T03:46:57Z

src/main/java/com/alibaba/dashscope/audio/ttsv2/SpeechSynthesizerV2.java

+      callback.onComplete();
+    }
+    if (audioStream != null) {
+      audioStream.reset(); // 重置 ByteArrayOutputStream，清空数据但保留缓冲区


The comment is in Chinese, while the rest of the code and comments are in English. For consistency and to make it understandable for a wider audience, please translate the comment to English.

Suggested change

-      audioStream.reset(); // 重置 ByteArrayOutputStream，清空数据但保留缓冲区
+      audioStream.reset(); // Reset ByteArrayOutputStream, clear data but keep the buffer

gemini-code-assist · 2026-02-05T03:46:58Z

src/test/java/com/alibaba/dashscope/TestTtsV2SpeechSynthesizerV2.java

+                  public void onMessage(WebSocket webSocket, String string) {
+                    System.out.println("mock server recv: " + string);
+                    JsonObject req = JsonUtils.parse(string);
+                    if (task_id == "") {


For clarity and to follow best practices, use task_id.isEmpty() instead of task_id == "" to check for an empty string.

Suggested change

-                    if (task_id == "") {
+                    if (task_id.isEmpty()) {
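
Beyond style, the two checks can give different answers: in Java, == on String operands compares object references, while isEmpty() inspects the content. The short sketch below shows the difference using a string that is empty but is not the interned "" literal. EmptyStringCheckSketch and its method names are illustrative only.

```java
final class EmptyStringCheckSketch {
  // Reference comparison: true only when the argument is the same object as the "" literal.
  static boolean byIdentity(String s) {
    return s == "";
  }

  // Content comparison: true for any zero-length string.
  static boolean byContent(String s) {
    return s.isEmpty();
  }

  public static void main(String[] args) {
    String built = new String(""); // empty, but a distinct object from the literal
    System.out.println(byIdentity(built));
    System.out.println(byContent(built));
  }
}
```

In the mock server, a task_id that was assigned from parsed JSON rather than from a literal could therefore fail the == "" check even when it is empty.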

(feat:model/cosyvoice):use new protocol to build websocket connection

f98318d

gemini-code-assist bot reviewed Feb 5, 2026

songguocola closed this Feb 5, 2026

songguocola deleted the dev/cosyvoice_v2 branch February 5, 2026 03:59
