Title
U2VC: One-Shot Voice Conversion using Two-Level Nested U-Structure
Description
This page presents audio demo samples of our work. The samples show that our proposed approach improves the naturalness of the converted speech while preserving its similarity to the target speech. Note that the model is trained only on the VCTK corpus, in which all speakers speak English.
Demo
Seen-to-seen conversion
Case | Source audio | Target voice | Converted audio (Proposed) | Converted audio (AGAIN-VC) | Converted audio (AdaIN-VC)
Male to Male | |||||
Male to Female | |||||
Female to Male | |||||
Female to Female |
Unseen-to-unseen conversion
Case | Source audio | Target voice | Converted audio (Proposed) | Converted audio (AGAIN-VC) | Converted audio (AdaIN-VC)
Male to Male | |||||
Male to Female | |||||
Female to Male | |||||
Female to Female |
Cross-lingual conversion
The VCC2020 corpus is used for cross-lingual conversion. Mandarin is mainly used for the evaluation because the raters are Mandarin-English bilingual speakers. Samples in other languages are also shown here.
Case | Source audio | Target voice | Converted audio (Proposed) | Converted audio (AGAIN-VC) | Converted audio (AdaIN-VC)
Mandarin to English | |||||
Mandarin to English | |||||
Mandarin to English | |||||
Mandarin to English | |||||
English to Mandarin | |||||
English to Mandarin | |||||
English to Mandarin | |||||
English to Mandarin | |||||
Finnish to Finnish | |||||
German to English | |||||
Mandarin to German | |||||
English to Finnish |