Given a new input utterance and a target speaker sample, is it possible to use a pretrained model to do voice conversion?
Since the speaker embedding in the vocoder has to be learnt, I was considering training only the vocoder to learn an embedding for the new target speaker, and then running convert.py to get the voice-converted output. Can it be done this way? If not, please suggest how to do it.
Thanks,