Abstract
Owing to reverberation and background noise, far-field speech recognition still lags well behind close-talk speech recognition. To improve far-field performance, this paper builds on post-filter beamforming and proposes a front-end enhancement method that combines a deep neural network with a Wiener post-filter. The Wiener filter is embedded in the neural network to enhance the beamformed speech output and suppress correlated noise; the enhanced speech is then recognized with a TDNN-LSTM close-talk speech recognition system. Experiments on the CHiME-5 dataset show that the proposed method outperforms the conventional post-filtering method on far-field speech, reducing the word error rate (WER) by 2.3%.
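For orientation, the classical Wiener post-filter that the abstract builds on can be sketched as a per-frequency-bin gain applied to the beamformer output. This is only a minimal illustration of the textbook gain G = SNR / (1 + SNR), not the authors' neural-network formulation. The function names, the power-subtraction SNR estimate, and the `floor` parameter are assumptions made for this sketch.

```python
def wiener_gain(noisy_psd, noise_psd, floor=0.1):
    """Per-bin Wiener gain, assuming the noise power is known per bin.

    With SNR = speech / noise, the gain SNR / (1 + SNR) equals
    speech / (speech + noise). The speech power is estimated here by
    simple power subtraction (an assumption of this sketch).
    """
    gains = []
    for y, n in zip(noisy_psd, noise_psd):
        if y <= 0.0:
            # No energy in this bin: fall back to the gain floor.
            gains.append(floor)
            continue
        speech = max(y - n, 0.0)          # estimated clean-speech power
        denom = speech + n
        g = speech / denom if denom > 0.0 else floor
        gains.append(max(g, floor))       # floor limits over-suppression
    return gains


def apply_post_filter(beamformed_spectrum, gains):
    """Scale each (complex) beamformed frequency bin by its gain."""
    return [g * x for g, x in zip(gains, beamformed_spectrum)]


# Hypothetical two-bin example: bin 0 is speech-dominated, bin 1 is noise only.
gains = wiener_gain([4.0, 1.0], [1.0, 1.0])
enhanced = apply_post_filter([2 + 0j, 1 + 1j], gains)
```

In this toy example the speech-dominated bin is attenuated only slightly, while the noise-only bin is pushed down to the floor. In the paper's setting the filter is embedded in a neural network rather than computed from fixed noise estimates as here.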
Authors
LIU Chengran (刘诚然)
SONG Xiaoxiao (宋潇潇)
QU Dan (屈丹)
YANG Xukui (杨绪魁)
(Information Engineering University, Zhengzhou 450001, China; Henan Information Center, Zhengzhou 450003, China)
Source
Journal of Information Engineering University (《信息工程大学学报》)
2019, No. 4, pp. 405-409, 416 (6 pages in total)
Funding
Supported by the National Natural Science Foundation of China (61673395).
Keywords
far-field speech recognition
Wiener post-filter
deep neural networks
beamforming