Time-frequency mask estimation-based speech enhancement using deep encoder-decoder neural network

Abstract: A time-frequency mask estimation method for speech enhancement based on a deep encoder-decoder neural network is presented. The mask is learned implicitly by the network and combined with the time-frequency representation of the noisy speech to learn the nonlinear mapping between the noisy and target speech. The network employs a convolution and deconvolution structure. The convolutional encoder uses the local perception characteristic of convolutional networks to model the typical structural features of noisy speech in the time-frequency domain, extracting speech features while suppressing the influence of background noise. At the decoder, the speech signal is reconstructed from the features extracted by the encoder, and local details are recovered layer by layer. Skip connections are introduced between corresponding (homologous) encoder and decoder layers to mitigate the loss of low-level detail caused by pooling and down-sampling operations. Experiments on the TIMIT dataset demonstrate that the proposed method can effectively suppress noise and recover the detailed information of speech.
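The two ideas in the abstract, a mask multiplied element-wise with the noisy time-frequency representation and a skip connection that restores detail lost to down-sampling, can be illustrated with a toy sketch. This is not the authors' network: the learned convolution and deconvolution layers are replaced here by fixed averaging and bin repetition, and the names `downsample`, `upsample`, `estimate_mask`, `enhance` and the `gain` parameter are hypothetical, chosen only for illustration.

```python
import math


def sigmoid(x):
    """Squash a real value into (0, 1) so it can act as a mask gain."""
    return 1.0 / (1.0 + math.exp(-x))


def downsample(frame):
    """Toy encoder step: average adjacent frequency bins (halves resolution)."""
    if len(frame) % 2 != 0:
        raise ValueError("this sketch assumes an even number of frequency bins")
    return [(frame[i] + frame[i + 1]) / 2.0 for i in range(0, len(frame), 2)]


def upsample(coarse):
    """Toy decoder step: repeat each coarse bin twice (restores resolution)."""
    return [value for value in coarse for _ in range(2)]


def estimate_mask(frame, gain=4.0, use_skip=True):
    """Map one magnitude frame to a mask in (0, 1).

    Assumption: fixed operations stand in for learned layers. With
    use_skip=True the encoder input is added back to the decoder output,
    mimicking a skip connection between layers of equal resolution.
    """
    decoded = upsample(downsample(frame))
    if use_skip:
        decoded = [d + x for d, x in zip(decoded, frame)]
    threshold = sum(decoded) / len(decoded)
    return [sigmoid(gain * (d - threshold)) for d in decoded]


def enhance(noisy_mag, use_skip=True):
    """Apply the estimated mask element-wise to every frame."""
    return [
        [m * x for m, x in zip(estimate_mask(frame, use_skip=use_skip), frame)]
        for frame in noisy_mag
    ]


# One frame with a single strong bin (index 1) among weak, noise-like bins.
noisy = [[0.1, 0.9, 0.1, 0.1]]
mask_with_skip = estimate_mask(noisy[0], use_skip=True)
mask_without_skip = estimate_mask(noisy[0], use_skip=False)
enhanced = enhance(noisy)
```

Without the skip connection, bins 0 and 1 receive the same mask value because averaging has merged them; with it, the strong bin is distinguished from its weak neighbour again. In an actual model, the mask and the mapping would be learned jointly from pairs of noisy and clean speech.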

Authors: SHI Wenhua, ZHANG Xiongwei, ZOU Xia, SUN Meng, LI Li, REN Zhengbing

Affiliations: Army Engineering University; Beijing Aeronautical Technology Research Center; First Military Representation Office of Air Force Equipment Department

Source: Chinese Journal of Acoustics (声学学报（英文版）), CSCD, 2021, No. 1, pp. 141-154 (14 pages)

Funding: Supported by the National Natural Science Foundation of China (61471394, 62071484) and the Natural Science Foundation of Jiangsu Province for Excellent Young Scholars (BK20180080).

Keywords: decoder; network; neural

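Classification: TN912.35 [Electronics and Telecommunications: Communication and Information Systems]; TP183 [Automation and Computer Technology: Control Theory and Control Engineering]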
