Abstract
In recent years, with the development of deep learning, deep models have been adopted by a growing number of researchers for speech separation. Among them, deep neural networks (DNNs) have shown strong advantages in this field. To further improve the quality of the target speech, we propose a speech separation method based on combined DNNs (CE_DNN). First, two different training sets are used to train the DNN, yielding two DNN models with different parameters. The test data are then fed to both trained models, and the two outputs are combined. For the experiments, different types of noise are mixed with clean speech at several input signal-to-noise ratios (SNRs). The results show that, compared with a DNN speech separation system, CE_DNN improves both the HIT-FA measure (hit rate minus false-alarm rate) for the ideal binary mask (IBM) and the short-time objective intelligibility (STOI) of the target speech.
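The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two DNN outputs are combined, so averaging the two models' per-unit mask probabilities and thresholding at 0.5 is purely an assumption made here, and all function names, the 0 dB local criterion, and the toy inputs are hypothetical.

```python
import math


def noise_gain_for_snr(speech, noise, snr_db):
    """Gain to apply to `noise` so that speech + gain * noise has the given SNR (dB)."""
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    return math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10.0)))


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """IBM: 1 where the local SNR of a time-frequency unit exceeds lc_db, else 0."""
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            # A small floor avoids log(0) and division by zero.
            snr_db = 10.0 * math.log10(max(s, 1e-12) / max(n, 1e-12))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def combine_masks(prob_a, prob_b, threshold=0.5):
    """ASSUMED rule (not from the paper): average two models' outputs, then threshold."""
    return [
        [1 if (a + b) / 2.0 > threshold else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(prob_a, prob_b)
    ]


def hit_fa(estimated, ideal):
    """HIT-FA: share of IBM ones estimated as 1, minus share of IBM zeros estimated as 1."""
    hits = false_alarms = ones = zeros = 0
    for e_row, i_row in zip(estimated, ideal):
        for e, i in zip(e_row, i_row):
            if i == 1:
                ones += 1
                hits += e
            else:
                zeros += 1
                false_alarms += e
    hit_rate = hits / ones if ones else 0.0
    fa_rate = false_alarms / zeros if zeros else 0.0
    return hit_rate - fa_rate


# Toy 2x2 time-frequency grid (made-up energies and model outputs).
ibm = ideal_binary_mask([[4.0, 1.0], [1.0, 9.0]], [[1.0, 4.0], [2.0, 1.0]])
estimate = combine_masks([[0.9, 0.4], [0.2, 0.3]], [[0.7, 0.8], [0.1, 0.9]])
score = hit_fa(estimate, ibm)
```

In this toy case the combined mask recovers both speech-dominant units but also marks one of the two noise-dominant units as speech, so the hit rate is 1.0 and the false-alarm rate is 0.5.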
Authors
MIN Changwei (闵长伟); JIANG Hua (江华); YAN Ge (闫格); FENG Liqi (冯利琪)
Key Laboratory of Granular Computing and Application, Minnan Normal University, Zhangzhou 363000, China; School of Computer Science, Minnan Normal University, Zhangzhou 363000, China
Source
Peak Data Science (《数码设计》)
2018, No. 4, pp. 77-84 (8 pages)
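Funding
National Natural Science Foundation of China (No. 61472406)
Natural Science Foundation of Fujian Province (No. 2015J01269, No. 2016J01304)
Talent Introduction Project of Minnan Normal University
Yongzhou Science and Technology Bureau Project [(2015)7]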