Performance analysis of i-vector speaker recognition with a speech enhancement module

Speech enhancement in i-vector speaker verification system

Abstract: To address the model mismatch that arises in text-independent speaker verification when the training and recognition environments differ, we propose an i-vector speaker verification system that uses a speech enhancement module for front-end preprocessing, improving the system's robustness to additive environmental noise. To evaluate the performance of different speech enhancement algorithms, we ran experiments on the NIST08 core test set. After noise estimation with the IMCRA algorithm, four methods (Wiener filtering, MMSE-LSA, traditional spectral subtraction, and multi-band spectral subtraction) were applied as the speech enhancement front end and evaluated in the i-vector-based speaker verification system. The results show that the system with speech enhancement gains some robustness to noise. Multi-band spectral subtraction performed best when the SNR was relatively high, while MMSE-LSA performed best when the SNR was low.
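Two of the enhancement rules named in the abstract can be illustrated per frequency bin. The following is only a minimal sketch, not the authors' implementation: it assumes the noise power spectrum is already available (the paper obtains it with IMCRA, which is not implemented here), uses the over-subtraction rule with a spectral floor for spectral subtraction, and uses a simple maximum-likelihood a priori SNR estimate for the Wiener gain. The function names and the default values of `alpha` and `beta` are hypothetical choices for illustration.

```python
def spectral_subtraction(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Per-bin power spectral subtraction with over-subtraction and a floor.

    noisy_power, noise_power: sequences of per-bin power values for one frame.
    alpha: over-subtraction factor (illustrative default, an assumption).
    beta: spectral floor factor (illustrative default, an assumption).
    """
    cleaned = []
    for y, d in zip(noisy_power, noise_power):
        # Subtract the scaled noise estimate, but never go below the floor.
        cleaned.append(max(y - alpha * d, beta * d))
    return cleaned


def wiener_gain(noisy_power, noise_power):
    """Per-bin Wiener gain G = xi / (1 + xi).

    xi is the a priori SNR, estimated here as max(gamma - 1, 0), where
    gamma = noisy_power / noise_power is the a posteriori SNR. This simple
    estimator is an assumption made for the sketch.
    """
    gains = []
    for y, d in zip(noisy_power, noise_power):
        if d <= 0.0:
            # No noise estimated in this bin: pass the bin through unchanged.
            gains.append(1.0)
            continue
        snr_post = y / d
        snr_prio = max(snr_post - 1.0, 0.0)
        gains.append(snr_prio / (1.0 + snr_prio))
    return gains


if __name__ == "__main__":
    noisy = [10.0, 1.0]   # per-bin power of the noisy frame
    noise = [2.0, 1.0]    # per-bin noise power estimate
    print(spectral_subtraction(noisy, noise))
    print(wiener_gain(noisy, noise))
```

In a full front end these rules would be applied frame by frame to a short-time spectrum, and the enhanced signal would then be passed to feature extraction for the i-vector system.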

Authors: Li Xin (李昕), Li Wei (李为), You Hanxu (游寒旭), Zhu Jie (朱杰)

Affiliation: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University

Source: Journal of Shanghai Normal University (Natural Sciences), 2016, No. 2, pp. 237-242 (6 pages)

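Funding: National Natural Science Foundation of China (61271349, 61371147, 11433002); Shanghai Jiao Tong University Medical-Engineering Cooperation Fund (YG2012ZD04)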

Keywords: speaker verification; i-vector; speech enhancement; Wiener filtering; MMSE; spectral subtraction

Classification code: TN912.32 [Electronics and Telecommunications: Communication and Information Systems]
