RTX4060Ti being detected but not being used (Docker) #88

@KeepGood2016

Description

I have an odd problem. I'm running the Docker version of Orpheus FastAPI. The host system has a 16GB RTX 4060 Ti. It is detected as a high-performance GPU, but it doesn't seem to be used: system memory usage goes up as the voice model loads, and CPU load increases when text is sent for conversion. Once (and only once) I saw GPU RAM usage increase as the model loaded, but there was still no GPU compute activity; CPU load still increased when receiving text, and the GPU sat near idle. The fastest I can get it to go is 0.84x real time, and only by forcing the CPU to run at 4.6 GHz constantly.

I've tried switching out the torch CUDA version; 12.4, 12.6, and 12.8 all give the same results. The host operating system is Windows 10.
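One way to narrow this down is to check, from inside the running container, whether PyTorch actually sees the GPU at all (as opposed to the host merely detecting it). Below is a minimal sketch; the helper name `cuda_report` is hypothetical, and it assumes the API container has Python available, with torch optionally installed:

```python
def cuda_report() -> str:
    """Report whether PyTorch can see a CUDA device in this environment."""
    try:
        import torch  # may not be installed in every container image
    except ImportError:
        return "torch not installed in this environment"

    if torch.cuda.is_available():
        # torch sees the GPU; inference *can* run on it
        return f"CUDA available: {torch.cuda.get_device_name(0)}"

    # torch is present but cannot reach a CUDA device --
    # inference will silently fall back to CPU
    return "CUDA NOT available: inference will run on CPU"


if __name__ == "__main__":
    print(cuda_report())
```

If this prints that CUDA is not available inside the container even though the host sees the card, the problem is usually GPU passthrough (e.g. the container not being started with GPU access) rather than the torch build itself.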

Does anyone have any idea why this could be happening?

Update:
Trying this on a fresh day, without touching the system at all, I changed the model in the env file to the Q8 model. It now loads onto the GPU, using roughly 5.4 GB of VRAM, and GPU utilization jumps to 13% while converting text. I'm getting a whole 0.54x real time now, apparently running from the GPU.
