Multilingual text-to-waveform generation with cross-speaker prosody transfer
Abstract In multilingual speech synthesis, single-speaker multilingual data is scarce, which makes it very difficult for one voice to support synthesis in multiple languages. Unlike previous methods that decouple timbre and pronunciation only within the acoustic model, this paper proposes an end-to-end multilingual speech synthesis method that incorporates cross-speaker prosody transfer. The method uses a two-level hierarchical conditional variational auto-encoder to model the generation process from text to waveform directly, and decouples timbre, pronunciation, and prosody. It improves the prosody of cross-lingual synthesis by transferring the prosody style of existing speakers in the target language. Experiments show that the proposed model achieves mean opinion scores of 3.91 for naturalness and 4.01 for similarity in cross-lingual speech generation, and that its cross-lingual word error rate falls to 5.85% relative to the baselines. Prosody transfer and ablation experiments further demonstrate the effectiveness of the proposed method.
Authors SHANG Zengqiang (尚增强); ZHANG Pengyuan (张鹏远); WANG Li (王丽) (Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049)
Source Acta Acustica (声学学报), 2024, No. 1, pp. 171-180 (10 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding Supported by the National Key Research and Development Program of China (2021YFC3320102, 2021YFC3320103).
Keywords Multilingual speech synthesis; Prosody transfer; Variational auto-encoder; Prosody decoupling