Speech enhancement based on an improved cycle-consistent generative adversarial network
Abstract: Speech enhancement based on generative adversarial networks is limited by the lack of paired speech samples. To overcome this, an improved unpaired-data generation model based on the cycle-consistent generative adversarial network (CycleGAN) is proposed. By introducing a 2-1-2D CNN generator and a PatchGAN discriminator, the improved CycleGAN-2-1-2D model learns the multi-dimensional features of speech samples more effectively and greatly shortens training time. A subset of clean speech from the LibriTTS corpus is selected as training set A; other samples from the corpus, with three types of noise added, form training set B. Training sets A and B serve as the inputs to the CycleGAN-2-1-2D model. Speech enhancement models based on CycleGAN-2D and on nonnegative matrix factorization (NMF) are used as baselines, and the quality of the speech generated by the three models is evaluated in simulation experiments. The results show that, compared with the NMF model, the CycleGAN-2-1-2D model produces speech of considerably higher quality; compared with the CycleGAN-2D model, it yields a clear improvement in the enhancement of female voices.
Authors: XU Longting; TIAN Mianxin; WEI Zhilin (College of Information Science and Technology, Donghua University, Shanghai 201620, China)
Source: Journal of Donghua University (Natural Science), 2022, No. 5, pp. 70-76 (7 pages). Indexed in: CAS; Peking University Core Journals.
Funding: Shanghai Sailing Program (19YF1402000); Young Scientists Fund of the National Natural Science Foundation of China (62001100).
Keywords: speech enhancement; deep neural networks; cycle-consistent generative adversarial networks; non-parallel data
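Note: The abstract's unpaired training setup (clean set A, noisy set B, no sample-level pairing) rests on the cycle-consistency idea of CycleGAN: a clean-to-noisy mapping followed by a noisy-to-clean mapping should return the original input, and vice versa. The following is a minimal sketch of the standard L1 cycle-consistency term only, not the authors' implementation. The function names, the toy generators, and the feature shape (batch, frequency bins, frames) are assumptions made for illustration; the adversarial PatchGAN term and the 2-1-2D CNN generator are not modeled here.

```python
import numpy as np


def cycle_consistency_loss(x_clean, y_noisy, g_clean_to_noisy, f_noisy_to_clean):
    """L1 cycle-consistency loss over two unpaired batches of features.

    x_clean and y_noisy need not correspond sample by sample; each batch
    is only compared with its own round-trip reconstruction.
    """
    # Forward cycle: clean -> noisy domain -> back to clean.
    x_cycled = f_noisy_to_clean(g_clean_to_noisy(x_clean))
    # Backward cycle: noisy -> clean domain -> back to noisy.
    y_cycled = g_clean_to_noisy(f_noisy_to_clean(y_noisy))
    return float(np.mean(np.abs(x_cycled - x_clean))
                 + np.mean(np.abs(y_cycled - y_noisy)))


# Hypothetical stand-ins for the two generators (real ones would be CNNs).
def toy_g(x):
    return x + 0.5  # pretend "add noise" mapping


def toy_f(y):
    return y - 0.5  # exact inverse of toy_g


def toy_f_imperfect(y):
    return y - 0.4  # inverse that is off by 0.1


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 80, 64))  # assumed (batch, freq bins, frames)
y = rng.standard_normal((4, 80, 64))  # unpaired with x

loss_perfect = cycle_consistency_loss(x, y, toy_g, toy_f)
loss_imperfect = cycle_consistency_loss(x, y, toy_g, toy_f_imperfect)
print(loss_perfect, loss_imperfect)
```

With mutually inverse mappings the loss is zero up to floating-point error; with the imperfect inverse each cycle is off by 0.1, so the two terms sum to about 0.2. In training, this term is minimized jointly with the adversarial losses so that the generators stay approximately invertible without ever seeing paired clean/noisy utterances.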