Abstract
In recent years, advances in deep neural networks have produced a growing number of deep architectures, many of which have been applied to speech separation. This paper studies a speech-noise separation algorithm based on a long short-term memory (LSTM) model: Mel-frequency cepstral coefficients (MFCCs) serve as the model input for mask estimation, and the Griffin-Lim algorithm reconstructs the separated speech. Experiments show a clear improvement in separation quality over a CNN-based method, particularly for intermittent (burst) noise.
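The reconstruction step named in the abstract can be illustrated with a minimal sketch. It assumes a time-frequency mask is already available (the LSTM estimator and its MFCC front end are not shown), and the helper names `stft`, `istft`, `griffin_lim`, and `reconstruct_from_mask`, as well as the frame length, hop size, and iteration count, are illustrative choices rather than the paper's settings.

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Hann-windowed STFT; returns an array of shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, n_fft=256, hop=64):
    """Least-squares overlap-add inverse of stft()."""
    window = np.hanning(n_fft + 1)[:-1]
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    x = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + n_fft)
        x[seg] += frames[i] * window
        norm[seg] += window ** 2
    return x / np.maximum(norm, 1e-8)  # guard against division by zero at the edges

def griffin_lim(magnitude, n_iter=50, n_fft=256, hop=64, seed=0):
    """Estimate a waveform whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))  # random initial phase
    for _ in range(n_iter):
        signal = istft(magnitude * phase, n_fft, hop)
        rebuilt = stft(signal, n_fft, hop)
        phase = np.exp(1j * np.angle(rebuilt))  # keep the phase, discard the magnitude
    return istft(magnitude * phase, n_fft, hop)

def reconstruct_from_mask(mixture, mask, n_iter=50):
    """Apply a mask to the mixture's STFT magnitude, then run Griffin-Lim."""
    magnitude = np.abs(stft(mixture)) * mask
    return griffin_lim(magnitude, n_iter=n_iter)
```

In this sketch the mask must have the same shape as the mixture's STFT magnitude, that is (frames, n_fft // 2 + 1). Griffin-Lim recovers the phase only from the masked magnitude, so the output waveform can differ from the clean speech even when the mask is ideal.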
Authors
王先宇
张二华
WANG Xianyu; ZHANG Erhua (School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094)
Source
《计算机与数字工程》
2022, No. 9, pp. 2037-2041 (5 pages)
Computer & Digital Engineering