Abstract
This paper presents a two-stage single-channel speech dereverberation method that combines a bidirectional long short-term memory (BLSTM) recurrent neural network with non-negative matrix factorization (NMF). Log power spectra are used as features for suppressing reverberation. In the first stage, the BLSTM-RNN, which can capture information from anywhere in the feature sequence, estimates the log power spectra of the clean speech. In the second stage, NMF is applied to the reconstructed log power spectra as post-processing to alleviate the over-smoothing problem. Experimental results show that the proposed method effectively suppresses reverberation in speech signals and outperforms the baseline methods on the reported performance metrics.
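The two building blocks named in the abstract, log power spectrum features and NMF post-processing, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a Hann-windowed STFT, assumes the network output is mapped back to the power domain so that the NMF input is non-negative, and assumes a fixed dictionary `W` (hypothetically learned from clean speech) with activations estimated by standard multiplicative updates for the Euclidean cost. All function names and parameter values are hypothetical, and the BLSTM stage is omitted.

```python
import numpy as np

def log_power_spectrum(signal, frame_len=512, hop=256, eps=1e-10):
    """Return log |STFT|^2 with shape (freq_bins, frames), using a Hann window."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + eps).T

def nmf_activations(V, W, n_iter=100, eps=1e-10, seed=0):
    """Estimate H >= 0 with W held fixed, reducing ||V - W H||_F
    through multiplicative updates (assumed cost function)."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def nmf_postprocess(log_power, W, n_iter=100, eps=1e-10):
    """Project an estimated log power spectrum onto the dictionary W.
    Assumption: the projection is done in the power domain."""
    V = np.exp(log_power)            # back to a non-negative representation
    H = nmf_activations(V, W, n_iter)
    return np.log(W @ H + eps)       # return to the log power domain

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    noisy = rng.standard_normal(4096)        # stand-in for a speech signal
    lps = log_power_spectrum(noisy)          # would be the BLSTM output in practice
    W = rng.random((lps.shape[0], 8))        # hypothetical clean-speech dictionary
    print(nmf_postprocess(lps, W).shape)
```

Holding the dictionary fixed restricts the reconstruction to combinations of clean-speech spectral patterns, which is one plausible way an NMF stage could restore detail lost to over-smoothing. The abstract does not specify the cost function or how the dictionary is trained.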
Source
Journal of Signal Processing (《信号处理》)
CSCD
PKU Core (北大核心)
2017, No. 3, pp. 268-272 (5 pages)
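Funding
National High Technology Research and Development Program of China (863 Program) (2015AA016305)
National Natural Science Foundation of China (61425017, 61403386, 61305003, 61233009, 61273288)
Major Program of the National Social Science Foundation of China (13&ZD189)
Strategic Priority Research Program of the Chinese Academy of Sciences (Grant XDB02080006)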
Keywords
single-channel dereverberation
long short-term memory recurrent neural network
non-negative matrix factorization
deep learning