Abstract
In modern telecommunication systems, echo and reverberation often degrade the quality and intelligibility of transmitted speech. To suppress both at once, we propose a two-stage deep-learning system for joint acoustic echo cancellation and dereverberation, in which the two tasks are performed sequentially. The first stage uses a model based on the ideal ratio mask (IRM) to cancel the acoustic echo, which is uncorrelated with the target signal. The second stage uses a spectrum mapping model combined with a hidden mask to remove the reverberation, which is strongly correlated with the target signal. The two stages are then trained jointly to improve overall performance. Experiments under a range of acoustic conditions indicate that the proposed system substantially suppresses echo and reverberation and yields better speech quality and intelligibility than the other methods compared.
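The ideal ratio mask used as the first-stage training target can be illustrated with a small sketch. The abstract does not give the exact mask definition, so the code below assumes the commonly used square-root energy-ratio form, with the near-end target speech as the wanted component and the echo as the interference. The function names `ideal_ratio_mask` and `apply_mask`, the `eps` constant, and the toy magnitudes are hypothetical and are not taken from the paper.

```python
import math


def ideal_ratio_mask(target_mag, echo_mag, eps=1e-12):
    """Per-bin mask sqrt(S^2 / (S^2 + E^2)); values lie in [0, 1].

    target_mag and echo_mag are magnitude spectrograms given as lists of
    frames, each frame a list of frequency-bin magnitudes. This form of
    the IRM is an assumption, not the paper's stated definition.
    """
    mask = []
    for s_frame, e_frame in zip(target_mag, echo_mag):
        frame = []
        for s, e in zip(s_frame, e_frame):
            s2, e2 = s * s, e * e
            # eps guards against division by zero in silent bins
            frame.append(math.sqrt(s2 / (s2 + e2 + eps)))
        mask.append(frame)
    return mask


def apply_mask(mixture_mag, mask):
    """Scale each bin of the microphone magnitude spectrogram by the mask."""
    return [
        [m * g for m, g in zip(m_frame, g_frame)]
        for m_frame, g_frame in zip(mixture_mag, mask)
    ]


# Toy example: two frames, two frequency bins each (made-up values).
target = [[3.0, 0.0], [1.0, 2.0]]
echo = [[4.0, 1.0], [1.0, 0.0]]
mask = ideal_ratio_mask(target, echo)
```

In a system of the kind described, a network would be trained to predict such a mask from the microphone and far-end signals, since the clean target is not available at inference time; the predicted mask would then be applied with something like `apply_mask`.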
Authors
Luan Shuming (栾书明)
Cheng Longbiao (程龙彪)
Sun Xingwei (孙兴伟)
Li Junfeng (李军锋)
Yan Yonghong (颜永红)
Affiliations: Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
Source
Journal of Signal Processing (《信号处理》)
CSCD
PKU Core Journals (北大核心)
2020, No. 6, pp. 948-957 (10 pages)
Funding
National Key Research and Development Program of China (2017YFB1002803)
National Natural Science Foundation of China (11722437, 11674352)
Keywords
acoustic echo cancellation
dereverberation
bidirectional long short-term memory
ideal ratio mask
joint training
spectrum mapping