Abstract
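To separate speech under noisy conditions, a single-channel speech separation algorithm based on sparse non-negative matrix factorization and a deep attractor network is proposed. First, dictionary matrices for speech and noise are obtained by training and used as prior information to separate the speech and noise coefficient matrices from the noisy speech mixture. Then, because the different source components in the speech coefficient matrix differ in their similarity in the embedding space, a deep attractor network separates that matrix into a coefficient matrix for each source's speech. Finally, the separated coefficient matrices are combined with the speech dictionary matrix to reconstruct clean separated speech. Experimental results under different noise conditions show that the proposed algorithm improves the overall quality of the separated speech while suppressing background noise, and outperforms baseline algorithms that combine denoising with a speech separation model.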
The performance of monaural speech separation methods is limited when the speech mixture is corrupted by background noise. To obtain enhanced separated speech from a noisy mixture, a monaural noisy speech separation method combining Sparse Non-negative Matrix Factorization (SNMF) and a Deep Attractor Network (DANet) is proposed. The method first decomposes the noisy mixture into coefficients of the speech and noise signals. The speech coefficients are then projected into a high-dimensional embedding space, and a DANet is trained to force the embeddings to move toward different clusters. The attractor points are used to separate the speech coefficients by masking, and finally the enhanced separated speech signals are reconstructed from the speech basis and their corresponding coefficients. Experimental results in various background noise environments show that, compared with different baseline methods, the proposed algorithm effectively suppresses noise without degrading the quality of the reconstructed speech.
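The first stage described above (estimating speech and noise coefficients from fixed, pre-trained dictionaries) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the cost function, so a Euclidean cost with an L1 sparsity penalty and the standard multiplicative update are assumed here. The function name `estimate_coefficients`, the parameter values, and the random toy dictionaries are all hypothetical. The DANet stage is not shown.

```python
import numpy as np

def estimate_coefficients(V, W_speech, W_noise, sparsity=0.1, n_iter=200, eps=1e-12):
    """Estimate non-negative coefficients H for a fixed dictionary W = [W_speech, W_noise].

    Assumed objective (not taken from the paper):
        0.5 * ||V - W H||_F^2 + sparsity * sum(H),  with H >= 0.
    """
    W = np.hstack([W_speech, W_noise])           # dictionaries stay fixed
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    WtV = W.T @ V                                # constant across iterations
    WtW = W.T @ W
    for _ in range(n_iter):
        # Multiplicative update; the L1 penalty appears in the denominator.
        H *= WtV / (WtW @ H + sparsity + eps)
    k = W_speech.shape[1]
    return H[:k], H[k:]                          # speech rows, noise rows

# Toy data standing in for magnitude spectrograms (hypothetical sizes).
rng = np.random.default_rng(1)
W_s = rng.random((64, 8))                        # "speech" dictionary
W_n = rng.random((64, 4))                        # "noise" dictionary
V = W_s @ rng.random((8, 20)) + W_n @ rng.random((4, 20))

H_s, H_n = estimate_coefficients(V, W_s, W_n)
speech_estimate = W_s @ H_s                      # speech part of the mixture
```

In the method summarized above, the speech coefficients (here `H_s`) would then be passed to the deep attractor network and split per speaker before reconstruction with the speech dictionary.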
Authors
葛宛营
张天骐
范聪聪
张天
GE Wanying; ZHANG Tianqi; FAN Congcong; ZHANG Tian (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065)
Source
《声学学报》
EI
CAS
CSCD
Peking University Core Journals
2021, No. 1, pp. 55-66 (12 pages)
Acta Acustica
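Funding
National Natural Science Foundation of China (61671095, 61371164, 61702065, 61701067, 61771085)
Construction Project of the Chongqing Municipal Key Laboratory of Signal and Information Processing (CSTC2009CA2003)
Chongqing Graduate Research and Innovation Project (CYS17219)
Scientific Research Projects of the Chongqing Municipal Education Commission (KJ130524, KJ1600427, KJ1600429)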