Converting the dimensions of the generated model #5

Open
qazwsx921028 opened this issue Jul 22, 2024 · 4 comments

Comments

@qazwsx921028

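The generated model is se.pth, with shape torch.Size([1, 256, 1]). How can I convert it to the torch.Size([768]) that the TTS model can use?
I tried converting it myself, but although the voice I cloned is female, the output voice is male.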

@HKoon
Owner

HKoon commented Jul 22, 2024

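It cannot be converted directly. The project actually uses two separate models, and what is passed between them is an audio file. The network architectures and parameters of the two models cannot be shared.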
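As a quick sanity check on the two shapes quoted in the question, the sketch below counts the values each tensor holds. It uses plain tuples instead of real tensors, and the helper names are made up for illustration. The element counts differ, so a plain reshape is ruled out before the question of whether the two embedding spaces mean the same thing even comes up.

```python
from math import prod

# Shapes quoted in the question above.
SE_SHAPE = (1, 256, 1)  # se.pth produced by the tone color conversion model
SPK_SHAPE = (768,)      # speaker embedding shape the TTS model expects


def element_count(shape):
    """Number of scalar values in a tensor of the given shape."""
    return prod(shape)


def can_reshape(src_shape, dst_shape):
    """A reshape is only possible when both shapes hold the same number of values."""
    return element_count(src_shape) == element_count(dst_shape)


print(element_count(SE_SHAPE), element_count(SPK_SHAPE))
print(can_reshape(SE_SHAPE, SPK_SHAPE))
```

Padding or projecting the 256 values up to 768 would make the shapes match, but the numbers would still come from a different model's feature space, which is consistent with the female-to-male result reported above.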

@HKoon
Owner

HKoon commented Jul 22, 2024

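The first-stage model is ChatTTS's text-to-speech model; to get stable output from this stage you can fix a speaker seed. The second-stage model is OpenVoice's tone color conversion model. The se is the tone color feature it extracts from the audio, and that feature can only be recognized and used by the tone color conversion model.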
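The two-stage flow described above can be sketched roughly as follows. This is only an outline under assumptions: `run_pipeline`, `tts_stage`, and `convert_stage` are hypothetical names, not ChatTTS or OpenVoice functions, and the real calls would be wrapped inside the two callables. The point it illustrates is that the only thing crossing the boundary between the models is a WAV file on disk.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical stage signatures; neither is a real ChatTTS or OpenVoice call.
TtsStage = Callable[[str, Path], None]       # (text, wav_out): writes speech to wav_out
ConvertStage = Callable[[Path, Path], None]  # (wav_in, wav_out): applies the cloned tone color


def run_pipeline(text: str, output_wav: Path,
                 tts_stage: TtsStage, convert_stage: ConvertStage) -> Path:
    """Run text-to-speech, then tone color conversion, exchanging only an audio file."""
    with tempfile.TemporaryDirectory() as tmp:
        intermediate = Path(tmp) / "tts_raw.wav"

        # Stage 1: the TTS model speaks the text in its own (seeded) voice.
        tts_stage(text, intermediate)
        if not intermediate.exists():
            raise FileNotFoundError("TTS stage did not write its output file")

        # Stage 2: the converter reads that audio and re-voices it.
        # The se feature lives entirely inside this stage.
        convert_stage(intermediate, output_wav)
    return output_wav
```

Because the se never leaves the second stage, there is no point in this flow where it could be handed to the TTS model.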

@qazwsx921028
Author

If I want ChatTTS to use the cloned tone color feature, how should I modify things? Is there a feasible approach?

@HKoon
Owner

HKoon commented Jul 22, 2024

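The cloned tone color feature is already applied automatically. If the result does not sound like the target, the most likely cause is that the ChatTTS voice differs too much from it. You can use the ChatTTS speaker seed to control the original voice before cloning. Try to keep the original voice as stable as possible.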
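To show why a fixed seed keeps the pre-cloning voice stable, here is a small stand-in written with only the standard library. It is not ChatTTS's own sampling code: `sample_speaker` is a hypothetical helper, and the 768 length is simply taken from the shape mentioned in the question. The idea it demonstrates is that the same seed always yields the same speaker vector, so the audio fed into the conversion stage starts from the same voice on every run.

```python
import random


def sample_speaker(seed, dim=768):
    """Stand-in for speaker sampling: draw a dim-length vector from a seeded generator."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


voice_a = sample_speaker(seed=2024)
voice_b = sample_speaker(seed=2024)
voice_c = sample_speaker(seed=7)

print(voice_a == voice_b)  # same seed, same speaker vector
print(voice_a == voice_c)  # different seed, different speaker vector
```

In practice this would mean trying a few seeds, picking one whose voice is already close to the target (for example, a female-sounding voice when cloning a female speaker), and reusing that seed for every generation.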
