Title

U2VC: One-Shot Voice Conversion using Two-Level Nested U-Structure

Description

This page presents audio demo samples of our work. The samples show that our proposed approach improves the naturalness of the converted speech while preserving its similarity to the target speech. Note that the model is trained only on the VCTK corpus, in which all speakers speak English.

Demo

Seen-to-seen conversion

Case Source audio Target voice Converted audio
Proposed AGAIN-VC AdaIN-VC
Male to Male
Male to Female
Female to Male
Female to Female

Unseen-to-unseen conversion

Case Source audio Target voice Converted audio
Proposed AGAIN-VC AdaIN-VC
Male to Male
Male to Female
Female to Male
Female to Female

Cross-lingual conversion

The VCC2020 corpus is used for cross-lingual conversion. Mandarin is mainly used for the evaluation because the raters are Mandarin-English bilingual speakers. Samples in other languages are also shown here.

Case Source audio Target voice Converted audio
Proposed AGAIN-VC AdaIN-VC
Mandarin to English
Mandarin to English
Mandarin to English
Mandarin to English
English to Mandarin
English to Mandarin
English to Mandarin
English to Mandarin
Finnish to Finnish
German to English
Mandarin to German
English to Finnish