Speech separation in time-and-frequency domain based on multi-scale convolution (cited by: 2)
Abstract In mixed speech separation, deep learning methods that operate on time-domain features outperform those that operate on frequency-domain features. However, current time-domain methods are not robust in real noisy environments, and relying on time-domain features alone limits the performance of the separation model. This paper therefore proposes a multi-feature speech separation method based on the Conv-TasNet network that fuses frequency-domain and time-domain features to enrich the multidimensional information available in the data. To further improve the separation network, a multi-scale convolution block is introduced to strengthen its feature extraction ability. In experiments that include real noise, the proposed method improves on the Conv-TasNet model and on the latest time-frequency fusion speech separation baseline by 0.91 dB and 0.52 dB, respectively, effectively improving both separation performance and robustness.
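The abstract does not give architectural details, so the following is only a minimal NumPy sketch of the two ideas it names: fusing time-domain and frequency-domain features, and extracting features at several temporal scales. Every name and setting here is an assumption made for illustration (`frame_signal`, `time_features`, `frequency_features`, `multi_scale_conv`, the frame length, the random basis, and the kernel sizes 3, 5, 7). The random basis stands in for a learned Conv-TasNet-style encoder, and the fixed averaging kernels stand in for learned convolution weights; this is not the authors' implementation.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames, shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def time_features(frames, basis):
    """Stand-in for a learned time-domain encoder: linear projection plus ReLU."""
    return np.maximum(frames @ basis, 0.0)


def frequency_features(frames):
    """Magnitude spectrum of each Hann-windowed frame."""
    window = np.hanning(frames.shape[1])
    return np.abs(np.fft.rfft(frames * window, axis=1))


def multi_scale_conv(features, kernel_sizes=(3, 5, 7)):
    """Filter every channel along time at several kernel sizes, then
    concatenate the branches channel-wise."""
    branches = []
    for k in kernel_sizes:
        kernel = np.ones(k) / k  # assumed fixed averaging kernel, not learned
        branch = np.stack(
            [np.convolve(features[:, c], kernel, mode="same")
             for c in range(features.shape[1])],
            axis=1,
        )
        branches.append(branch)
    return np.concatenate(branches, axis=1)


rng = np.random.default_rng(0)
mixture = rng.standard_normal(1600)            # synthetic stand-in for a mixed signal
frames = frame_signal(mixture, frame_len=32, hop=16)
basis = rng.standard_normal((32, 64))          # hypothetical encoder basis

t_feat = time_features(frames, basis)          # time-domain features
f_feat = frequency_features(frames)            # frequency-domain features
fused = np.concatenate([t_feat, f_feat], axis=1)
multi = multi_scale_conv(fused)

print(fused.shape, multi.shape)
```

Concatenating along the channel axis is the simplest way to fuse the two feature sets; the paper may fuse them differently. In a trained model the separator would estimate per-speaker masks from features like `multi`, which this sketch does not attempt.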
Authors Jia Linfeng; Wu Liming; Wen Tengteng; Liao Yutao; Gao Zihao (School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006, China)
Source Journal of Electronic Measurement and Instrumentation (《电子测量与仪器学报》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 11, pp. 134-140 (7 pages)
Funding Supported by the National Natural Science Foundation of China (61705045) and the Innovation and Entrepreneurship Talent Team Program of the Foshan Guangdong University of Technology Research Institute (20191108).
Keywords speech separation; feature fusion; multi-scale convolution; time-frequency domain features