Add API request for generate audio using Microsoft Edge's online text-to-speech service #31
Changes from all commits
f59a4ee
889ac0f
109eebf
56699c1
80ea872
a134dd1
3c650ba
f2b1d0a
381f395
593eebb
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -36,6 +36,7 @@ dependencies = [ | |
| "fastapi", | ||
| "pydantic", | ||
| "python-multipart", | ||
| "edge-tts", | ||
| ] | ||
|
|
||
| [project.urls] | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -15,4 +15,5 @@ loguru | |
| uvicorn | ||
| fastapi | ||
| pydantic | ||
| python-multipart | ||
| python-multipart | ||
| edge-tts | ||
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -3,6 +3,7 @@ | |||||||||||||||||||||||
| from fastapi.responses import Response, JSONResponse | ||||||||||||||||||||||||
| from loguru import logger | ||||||||||||||||||||||||
| from pydantic import BaseModel | ||||||||||||||||||||||||
| import edge_tts | ||||||||||||||||||||||||
| import tempfile | ||||||||||||||||||||||||
| import base64 | ||||||||||||||||||||||||
| import shutil | ||||||||||||||||||||||||
|
|
@@ -21,6 +22,13 @@ class SetParamsRequest(BaseModel): | |||||||||||||||||||||||
| class SetModelsDirRequest(BaseModel): | ||||||||||||||||||||||||
| models_dir: str | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| class TTSRequest(BaseModel): | ||||||||||||||||||||||||
| text: str | ||||||||||||||||||||||||
| voice: str | None = "Microsoft Server Speech Text to " | ||||||||||||||||||||||||
| rate: str | None = "+0%" | ||||||||||||||||||||||||
| volume: str | None = "+0%" | ||||||||||||||||||||||||
| pitch: str | None = "+0Hz" | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| def setup_routes(app: FastAPI): | ||||||||||||||||||||||||
| @app.post("/convert") | ||||||||||||||||||||||||
| def rvc_convert(request: ConvertAudioRequest): | ||||||||||||||||||||||||
|
|
@@ -117,6 +125,44 @@ def set_models_dir(request: SetModelsDirRequest): | |||||||||||||||||||||||
| except Exception as e: | ||||||||||||||||||||||||
| raise HTTPException(status_code=400, detail=str(e)) | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| @app.get("/voices") | ||||||||||||||||||||||||
| async def list_voices(): | ||||||||||||||||||||||||
| return JSONResponse(content={"voices": await edge_tts.list_voices()}) | ||||||||||||||||||||||||
|
Comment on lines +128 to +130

Add exception handling to the list_voices endpoint

Currently, the list_voices endpoint does not handle exceptions that edge_tts.list_voices() may raise, so a failure there surfaces as an unhandled error. Apply this diff to incorporate exception handling:

     @app.get("/voices")
     async def list_voices():
    +    try:
             voices = await edge_tts.list_voices()
             return JSONResponse(content={"voices": voices})
    +    except Exception as e:
    +        logger.error(f"Error retrieving voices: {e}")
    +        raise HTTPException(status_code=500, detail="Failed to retrieve voices.") from e
|
||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| @app.post("/tts") | ||||||||||||||||||||||||
| async def tts(request: TTSRequest): | ||||||||||||||||||||||||
| if not app.state.rvc.current_model: | ||||||||||||||||||||||||
| raise HTTPException(status_code=400, detail="No model loaded. Please load a model first.") | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| tmp_input = tempfile.NamedTemporaryFile(delete=False, suffix=".wav") | ||||||||||||||||||||||||
| tmp_output = tempfile.NamedTemporaryFile(delete=False, suffix=".wav") | ||||||||||||||||||||||||
| try: | ||||||||||||||||||||||||
| logger.info("Received request to generate audio by tts") | ||||||||||||||||||||||||
| input_path = tmp_input.name | ||||||||||||||||||||||||
| output_path = tmp_output.name | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| communicate = edge_tts.Communicate( | ||||||||||||||||||||||||
| text=request.text, | ||||||||||||||||||||||||
| voice=request.voice, | ||||||||||||||||||||||||
| rate=request.rate, | ||||||||||||||||||||||||
| volume=request.volume, | ||||||||||||||||||||||||
| pitch=request.pitch | ||||||||||||||||||||||||
| ) | ||||||||||||||||||||||||
| await communicate.save(input_path) | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| app.state.rvc.infer_file(input_path, output_path) | ||||||||||||||||||||||||
|
Avoid blocking the event loop with synchronous infer_file

The infer_file call is synchronous. Calling it directly inside the async tts endpoint blocks the event loop until inference finishes, so no other request can be served in the meantime. Apply this diff to run infer_file in a separate thread (asyncio must be imported in the module):

    - app.state.rvc.infer_file(input_path, output_path)
    + await asyncio.to_thread(app.state.rvc.infer_file, input_path, output_path)
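To see why the reviewer's point matters, here is a small self-contained sketch. It assumes a hypothetical blocking_infer function that merely sleeps in place of the real infer_file, and counts how often a background coroutine gets to run while the call is in progress. All names here are illustrative and not part of the PR.

```python
import asyncio
import time


def blocking_infer(input_path: str, output_path: str) -> str:
    """Hypothetical stand-in for a slow, synchronous inference call."""
    time.sleep(0.3)
    return output_path


async def count_ticks(stop: asyncio.Event) -> int:
    """Count how many times the event loop lets this coroutine run."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(0.01)
    return ticks


async def run_demo(offload: bool) -> int:
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # give the ticker a chance to start
    if offload:
        # Runs in a worker thread; the event loop stays responsive.
        await asyncio.to_thread(blocking_infer, "in.wav", "out.wav")
    else:
        # Runs on the event loop thread; nothing else is scheduled meanwhile.
        blocking_infer("in.wav", "out.wav")
    stop.set()
    return await ticker
```

Comparing asyncio.run(run_demo(False)) with asyncio.run(run_demo(True)) should show far more ticks in the offloaded case. asyncio.to_thread requires Python 3.9 or later.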
|
||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| output_data = tmp_output.read() | ||||||||||||||||||||||||
|
Reset the file pointer before reading

After writing to output_path, the tmp_output handle may not be positioned at the start of the written data, so tmp_output.read() can return incomplete or empty content. Apply this diff to reset the file pointer:

      tmp_output.close()
    + tmp_output = open(output_path, 'rb')
    + output_data = tmp_output.read()
    + tmp_output.close()

Alternatively, seek to the beginning before reading:

    tmp_output.seek(0)
    output_data = tmp_output.read()
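One way to sidestep the file-pointer question entirely is to use the temporary file only for its path: close the handle right away, let the writer create the content by path, then reopen the path for reading. The sketch below assumes a hypothetical fake_infer writer in place of infer_file; run_with_temp_output is likewise an illustrative helper, not code from the PR.

```python
import os
import tempfile


def run_with_temp_output(write_output) -> bytes:
    """Create a temp path, let write_output(path) fill it, and return the bytes."""
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".wav")
    output_path = tmp.name
    tmp.close()  # release the handle; only the path is needed from here on
    try:
        write_output(output_path)
        # A fresh handle always starts at offset 0, so no seek is required.
        with open(output_path, "rb") as f:
            return f.read()
    finally:
        os.unlink(output_path)  # clean up even if writing or reading failed


def fake_infer(output_path: str) -> None:
    """Hypothetical stand-in for a function that writes audio to a path."""
    with open(output_path, "wb") as f:
        f.write(b"RIFF-demo-bytes")
```

Closing the handle before another component writes to the same path also avoids sharing conflicts on platforms that restrict opening a file that is already open.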
|
||||||||||||||||||||||||
| return Response(content=output_data, media_type="audio/wav") | ||||||||||||||||||||||||
| except Exception as e: | ||||||||||||||||||||||||
| logger.error(e) | ||||||||||||||||||||||||
| raise HTTPException(status_code=500, detail=f"An error occurred: {str(e)}") | ||||||||||||||||||||||||
|
Chain exceptions when raising

When raising a new exception within an except block, use raise ... from e so that the original exception is recorded as the cause and its traceback is preserved. Apply this diff to chain exceptions:

      logger.error(e)
    - raise HTTPException(status_code=500, detail=f"An error occurred: {str(e)}")
    + raise HTTPException(status_code=500, detail=f"An error occurred: {str(e)}") from e

Tools: Ruff
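The difference between the two forms is visible on the exception object itself. In this minimal sketch, RequestFailed, to_int and capture are hypothetical names used only for illustration: with from e the original error is stored in __cause__, while without it Python only records it implicitly in __context__.

```python
class RequestFailed(Exception):
    """Hypothetical application-level error used for this illustration."""


def to_int(text: str, chain: bool) -> int:
    try:
        return int(text)
    except ValueError as e:
        if chain:
            # Explicit chaining: the ValueError becomes __cause__.
            raise RequestFailed(f"An error occurred: {e}") from e
        # Implicit chaining: the ValueError is only kept as __context__.
        raise RequestFailed(f"An error occurred: {e}")


def capture(chain: bool) -> RequestFailed:
    """Trigger the failure and hand back the raised exception for inspection."""
    try:
        to_int("fast", chain)
    except RequestFailed as err:
        return err
    raise AssertionError("to_int unexpectedly succeeded")
```

With explicit chaining the traceback reads "The above exception was the direct cause of the following exception", which makes the intent clear to whoever reads the logs.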
|
||||||||||||||||||||||||
| finally: | ||||||||||||||||||||||||
| tmp_input.close() | ||||||||||||||||||||||||
| tmp_output.close() | ||||||||||||||||||||||||
| os.unlink(tmp_input.name) | ||||||||||||||||||||||||
| os.unlink(tmp_output.name) | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
| def create_app(): | ||||||||||||||||||||||||
| app = FastAPI() | ||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
|
|
||||||||||||||||||||||||
Complete the default value for voice in TTSRequest

The default value for voice appears incomplete: "Microsoft Server Speech Text to ". Please provide a complete and valid default voice identifier to ensure the TTS service functions correctly.

Apply this diff to fix the default voice value:
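Separately from the reviewer's suggested change, a server could guard against an invalid or truncated voice value by checking it against the available voices before synthesis. The sketch below is only an illustration: SAMPLE_VOICES is hand-written sample data, the "ShortName" and "Locale" keys are an assumption about the shape of voice entries, and resolve_voice is a hypothetical helper that is not part of the PR.

```python
# Hand-written sample entries; the key names are assumed for illustration.
SAMPLE_VOICES = [
    {"ShortName": "en-US-AriaNeural", "Locale": "en-US", "Gender": "Female"},
    {"ShortName": "en-GB-RyanNeural", "Locale": "en-GB", "Gender": "Male"},
    {"ShortName": "ja-JP-NanamiNeural", "Locale": "ja-JP", "Gender": "Female"},
]


def resolve_voice(requested, voices, fallback_locale="en-US"):
    """Return the requested voice if it is known, else the first voice in the fallback locale."""
    known_names = {voice["ShortName"] for voice in voices}
    if requested in known_names:
        return requested
    for voice in voices:
        if voice["Locale"] == fallback_locale:
            return voice["ShortName"]
    raise ValueError(f"No voice available for locale {fallback_locale!r}")
```

With such a check in place, a request carrying an unknown or incomplete voice string falls back to a known voice instead of failing inside the TTS call. Whether a silent fallback or a 400 response is preferable is a design choice for the maintainers.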