Single-channel speech enhancement algorithm combining optimized U-Net and residual network

Cited by: 2


Abstract: Speech enhancement aims to recover a clean speech signal from noisy speech. To address the unstable training and unsatisfactory enhancement performance of existing deep neural network (DNN) based speech enhancement algorithms, a speech enhancement algorithm that combines an improved U-Net with a residual network (ResNet) is proposed. First, an end-to-end speech enhancement model based on U-Net is constructed. Residual units are then introduced into the encoder and decoder blocks of the model, so that the cross-layer connections and residual-term fitting of the ResNet structure are applied during model training. This helps recover the detailed feature information of the target speech, stabilizes model training, and improves the model's feature extraction ability and training efficiency, allowing the improved Residual-U-Net model to achieve better speech enhancement. Simulation results show that, compared with several existing speech enhancement methods, the proposed Residual-U-Net algorithm enhances speech more effectively, denoises well, and further improves the quality and intelligibility of the speech signal.
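The structure described in the abstract (a U-Net encoder/decoder whose blocks contain residual units) can be illustrated with a toy single-channel sketch. This is not the authors' implementation: the function names, the fixed kernels, the pair-averaging downsampling, the sample-repeat upsampling, and the additive merging of skip connections (a real U-Net typically concatenates feature maps) are all assumptions chosen to keep the example dependency-free.

```python
def conv1d_same(x, kernel):
    """Zero-padded 1-D correlation that keeps the input length (odd kernel)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(x) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def relu(x):
    return [max(0.0, v) for v in x]


def residual_unit(x, kernel):
    """Cross-layer (identity) connection plus a fitted residual term F(x)."""
    residual = relu(conv1d_same(x, kernel))   # F(x), hypothetical one-layer form
    return [a + b for a, b in zip(x, residual)]


def downsample(x):
    """Halve the length by averaging adjacent pairs (assumes even length)."""
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x) - 1, 2)]


def upsample(x):
    """Double the length by repeating each sample."""
    return [v for v in x for _ in range(2)]


def residual_unet(x, enc_kernels, mid_kernel, dec_kernels):
    """Toy Residual-U-Net: len(x) must be divisible by 2 ** len(enc_kernels)."""
    h, skips = list(x), []
    for k in enc_kernels:                     # encoder path
        h = residual_unit(h, k)
        skips.append(h)                       # saved for the U-Net skip link
        h = downsample(h)
    h = residual_unit(h, mid_kernel)          # bottleneck
    for k in dec_kernels:                     # decoder path
        h = upsample(h)
        skip = skips.pop()
        h = [a + b for a, b in zip(h, skip)]  # additive skip (simplification)
        h = residual_unit(h, k)
    return h


noisy = [0.1, -0.2, 0.4, 0.3, -0.1, 0.0, 0.2, 0.5]
smooth = [0.25, 0.5, 0.25]
enhanced = residual_unet(noisy, [smooth, smooth], smooth, [smooth, smooth])
print(len(enhanced))
```

Because each unit computes `x + F(x)`, a unit whose kernel is all zeros reduces to the identity mapping, which is the property usually credited with making deep residual models easier to train.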

Authors: XU Chundong; XU Lang; ZHOU Bin (School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China)

Affiliation: School of Information Engineering, Jiangxi University of Science and Technology

Source: Modern Electronics Technique (《现代电子技术》), 2022, No. 9, pp. 35-40 (6 pages)

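Funding: National Natural Science Foundation of China (11864016); National Natural Science Foundation of China (61671442); Jiangxi Province Culture and Art Science Planning Project, General Project (YG2017384).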

Keywords: speech enhancement; DNN; U-Net; ResNet; cross-layer connection; model training; residual unit introduction; feature extraction

Classification code: TN912.35-34 [Electronics and Telecommunications: Communication and Information Systems]





