Microphone Array Speech Separation Algorithm Based on TC-ResNet

Abstract: Traditional separation methods have limited ability to handle speech separation in highly reverberant, low signal-to-noise ratio (SNR) environments and therefore achieve unsatisfactory results. In this study, a convolutional neural network combining temporal convolution with a residual network (TC-ResNet) is proposed to realize speech separation in complex acoustic environments. A simplified steered-response power phase transform, denoted as GSRP-PHAT, is employed to reduce the computational cost. The extracted features are reshaped into a special tensor that serves as the system input and is processed by temporal convolution, which not only enlarges the receptive field of the convolution layer but also significantly reduces the network's computational cost. Residual blocks are used to combine multiresolution features and accelerate training. A modified ideal ratio mask is applied as the training target. Simulation results demonstrate that the proposed microphone array speech separation algorithm based on TC-ResNet achieves better performance in terms of distortion ratio, source-to-interference ratio, and short-time objective intelligibility in low-SNR and highly reverberant environments, particularly in untrained situations. This indicates that the proposed method generalizes to untrained conditions.
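As a rough illustration of the training target named in the abstract, the sketch below computes the standard (textbook) ideal ratio mask for each time-frequency unit and applies it to a mixture magnitude. The abstract does not say how the authors modify the mask, so this is not their variant; the function names, the exponent `beta=0.5`, the `eps` guard, and the toy values are all assumptions made for illustration.

```python
def ideal_ratio_mask(target_power, interference_power, beta=0.5, eps=1e-12):
    """Standard ideal ratio mask: (S / (S + N)) ** beta per time-frequency unit.

    Both arguments are equally shaped 2-D lists (frames x frequency bins)
    of non-negative power values. eps guards against division by zero.
    """
    mask = []
    for s_row, n_row in zip(target_power, interference_power):
        mask.append([(s / (s + n + eps)) ** beta for s, n in zip(s_row, n_row)])
    return mask


def apply_mask(mixture_magnitude, mask):
    """Scale each time-frequency unit of the mixture by its mask value."""
    return [
        [m * g for m, g in zip(m_row, g_row)]
        for m_row, g_row in zip(mixture_magnitude, mask)
    ]


# Toy 2-frame, 2-bin example (hypothetical values).
target = [[4.0, 0.0], [1.0, 9.0]]
interference = [[0.0, 4.0], [1.0, 0.0]]
mixture = [[2.0, 2.0], [1.5, 3.0]]

mask = ideal_ratio_mask(target, interference)
estimate = apply_mask(mixture, mask)
```

Units dominated by the target get a mask value near 1, units dominated by interference get a value near 0, and a network trained on such a target learns to predict the mask from the input features.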

Authors: Lin Zhou, Yue Xu, Tianyi Wang, Kun Feng, Jingang Shi

Affiliations: School of Information Science and Engineering; Center for Machine Vision and Signal Analysis

Source: Computers, Materials & Continua (SCIE, EI), 2021, Issue 11, pp. 2705-2716 (12 pages)

Funding: This work is supported by the National Key Research and Development Program of China under Grant 2020YFC2004003 and Grant 2020YFC2004002, and by the National Natural Science Foundation of China (NSFC) under Grant No. 61571106.

Keywords: residual networks; temporal convolution; neural networks; speech separation

Classification: TN9 [Electronics and Telecommunications: Information and Communication Engineering]
