Title

U2VC: One-Shot Voice Conversion using Two-Level Nested U-Structure

Description

This page presents audio demo samples for our work. The samples show that the proposed approach improves the naturalness of the converted speech while preserving its similarity to the target speech. Note that the model is trained only on the VCTK corpus, in which all speakers speak English.

Demo

Seen-to-seen conversion

Case	Source audio	Target voice	Converted audio (Proposed)	Converted audio (AGAIN-VC)	Converted audio (AdaIN-VC)
Male to Male
Male to Female
Female to Male
Female to Female

Unseen-to-unseen conversion

Case	Source audio	Target voice	Converted audio (Proposed)	Converted audio (AGAIN-VC)	Converted audio (AdaIN-VC)
Male to Male
Male to Female
Female to Male
Female to Female

Cross-lingual conversion

The VCC2020 corpus is used for cross-lingual conversion. Mandarin is mainly used for the evaluation because the raters are Mandarin-English bilingual speakers. Samples in other languages are also shown here.

Case	Source audio	Target voice	Converted audio (Proposed)	Converted audio (AGAIN-VC)	Converted audio (AdaIN-VC)
Mandarin to English
Mandarin to English
Mandarin to English
Mandarin to English
English to Mandarin
English to Mandarin
English to Mandarin
English to Mandarin
Finnish to Finnish
German to English
Mandarin to German
English to Finnish