Monaural Speech Separation by Means of Convolutive Nonnegative Matrix Partial Co-factorization in Low SNR Condition (cited by: 3)
Abstract: Nonnegative matrix partial co-factorization (NMPCF) is a joint matrix decomposition algorithm that uses the spectrum of a specified source as side information when factorizing the mixture spectrum, which helps determine the basis vectors of that source and thereby improves separation performance. Convolutive nonnegative matrix factorization (CNMF), which decomposes a matrix over a convolutive nonnegative basis set, has achieved good results in monaural speech separation. To separate speech under strong noise, we combine the advantages of these two algorithms and propose a monaural speech separation algorithm based on convolutive nonnegative matrix partial co-factorization (CNMPCF). First, a pitch (fundamental frequency) detection algorithm locates the speech onset in the mixture, from which the pure-noise (nonvocal) segments are determined. The vocal segments serve as the test mixture, and the spectrum of the pure-noise segments takes part in the partial joint decomposition together with the mixture spectrum, yielding the speech basis matrix and then the separated speech spectrogram. The time-domain speech signal is reconstructed by the inverse short-time Fourier transform. In the experiments, mixtures are generated at five signal-to-noise ratios (SNRs), from 0 dB to -12 dB in steps of 3 dB. The results show that, across noise types and noise intensities, the proposed CNMPCF method outperforms both sparse convolutive nonnegative matrix factorization (SCNMF) and NMPCF to varying degrees.
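The partial co-factorization idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' CNMPCF: it drops the convolutive bases, assumes a squared-error objective with multiplicative updates, and uses random matrices in place of real magnitude spectrograms. The function name `nmpcf`, the weight `lam`, the basis counts, and the soft-mask reconstruction are all assumptions made for illustration. The mixture spectrogram `X` and the noise-only spectrogram `Y` share the noise basis `Wn`, so the noise segments constrain which bases are left to model speech.

```python
import numpy as np

def nmpcf(X, Y, n_speech=8, n_noise=8, lam=1.0, n_iter=200, seed=0, eps=1e-9):
    """Illustrative partial co-factorization (non-convolutive, assumed objective):
    minimize ||X - Ws@Hs - Wn@Hn||^2 + lam * ||Y - Wn@Hy||^2 with all factors >= 0.
    X: mixture magnitude spectrogram (freq x frames).
    Y: noise-only magnitude spectrogram (freq x frames), sharing the basis Wn."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = X.shape
    Ws = rng.random((n_freq, n_speech)) + eps
    Wn = rng.random((n_freq, n_noise)) + eps
    Hs = rng.random((n_speech, n_frames)) + eps
    Hn = rng.random((n_noise, n_frames)) + eps
    Hy = rng.random((n_noise, Y.shape[1])) + eps
    losses = []
    for _ in range(n_iter):
        # Update the activations with the bases held fixed.
        Xhat = Ws @ Hs + Wn @ Hn
        Hs *= (Ws.T @ X) / (Ws.T @ Xhat + eps)
        Xhat = Ws @ Hs + Wn @ Hn
        Hn *= (Wn.T @ X) / (Wn.T @ Xhat + eps)
        Hy *= (Wn.T @ Y) / (Wn.T @ Wn @ Hy + eps)
        # Update the speech basis from the mixture only.
        Xhat = Ws @ Hs + Wn @ Hn
        Ws *= (X @ Hs.T) / (Xhat @ Hs.T + eps)
        # Update the shared noise basis from both the mixture and the noise segments.
        Xhat = Ws @ Hs + Wn @ Hn
        Wn *= (X @ Hn.T + lam * Y @ Hy.T) / (Xhat @ Hn.T + lam * (Wn @ Hy) @ Hy.T + eps)
        Xhat = Ws @ Hs + Wn @ Hn
        losses.append(np.sum((X - Xhat) ** 2) + lam * np.sum((Y - Wn @ Hy) ** 2))
    return Ws, Hs, Wn, Hn, losses

# Synthetic stand-ins for magnitude spectrograms (real use would take |STFT| values).
rng = np.random.default_rng(1)
X = rng.random((64, 40))   # mixture: 64 frequency bins, 40 frames
Y = rng.random((64, 20))   # noise-only segment: 64 frequency bins, 20 frames

Ws, Hs, Wn, Hn, losses = nmpcf(X, Y)

# Soft-mask estimate of the speech magnitude spectrogram (an assumed reconstruction step).
S_hat = X * (Ws @ Hs) / (Ws @ Hs + Wn @ Hn + 1e-9)
```

In a full pipeline along the lines the abstract describes, `S_hat` would be combined with the mixture phase and passed through an inverse short-time Fourier transform to obtain the time-domain speech estimate.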
Authors: DONG Xing-Lei; HU Ying; HUANG Hao; SILAMU Wushour (Department of Information Science and Engineering, Xinjiang University, Urumqi 830046; Laboratory of Multi-lingual Information Technology, Xinjiang University, Urumqi 830046)
Source: Acta Automatica Sinica (indexed in EI, CSCD, and the Peking University Core Journals list), 2020, No. 6, pp. 1200-1209 (10 pages)
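Funding: Supported by the National Natural Science Foundation of China (61761041, 61663044), the National Natural Science Foundation of China Youth Fund (61603323), the Natural Science Foundation of Xinjiang Uygur Autonomous Region (2016D01C061), the Natural Science Foundation of Xinjiang University (BS160239), and the Scientific Research Program of Higher Education Institutions of Xinjiang Autonomous Region (XJ EDU2017T002).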
Keywords: convolutive nonnegative matrix factorization (CNMF); nonnegative matrix partial co-factorization (NMPCF); speech separation; strong noise; monaural speech