Maximum Likelihood A Priori Knowledge Interpolation-Based Handset Mismatch Compensation for Robust Speaker Identification

Abstract: Unseen handset mismatch is the major source of performance degradation for speaker identification in telecommunication environments. To alleviate the problem, a maximum likelihood a priori knowledge interpolation (ML-AKI)-based handset mismatch compensation approach is proposed. The approach first collects the characteristics of a set of seen handsets and uses them as a priori knowledge that represents the space of handsets. During evaluation, the characteristics of an unknown test handset are optimally estimated by interpolation from this a priori knowledge. Experimental results on the HTIMIT database show that the ML-AKI method improves the average speaker identification rate from 60.0% to 74.6% compared with conventional maximum a posteriori-adapted Gaussian mixture models. The proposed ML-AKI method is therefore a promising method for robust speaker identification.
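The interpolation idea in the abstract can be illustrated with a small numerical sketch. The abstract does not give the paper's actual formulation, so everything below is an assumption made for illustration: each seen handset is represented by a feature-space bias vector (rows of a hypothetical matrix `B`), the unknown handset is modeled as a convex combination of those biases, and the weights are chosen to maximize the likelihood of the compensated test frames under a diagonal-covariance Gaussian mixture model. The optimizer used here (softmax-parameterized gradient ascent with backtracking) is a stand-in chosen for brevity, not the solver used by the authors.

```python
import numpy as np

def gmm_loglik_and_post(X, pi, mu, var):
    """Total log-likelihood and frame posteriors for a diagonal-covariance GMM."""
    diff = X[:, None, :] - mu[None, :, :]                      # (T, M, D)
    log_n = -0.5 * (np.sum(diff ** 2 / var, axis=2)
                    + np.sum(np.log(2 * np.pi * var), axis=1))  # (T, M)
    log_joint = np.log(pi) + log_n
    log_norm = np.logaddexp.reduce(log_joint, axis=1)           # (T,)
    post = np.exp(log_joint - log_norm[:, None])
    return log_norm.sum(), post

def softmax(theta):
    e = np.exp(theta - theta.max())
    return e / e.sum()

def estimate_interpolation_weights(Y, B, pi, mu, var, n_iter=200, step=1.0):
    """Assumed model: compensated frame x_t = y_t - w @ B, with w on the simplex."""
    theta = np.zeros(B.shape[0])          # uniform starting weights
    w = softmax(theta)
    ll, post = gmm_loglik_and_post(Y - w @ B, pi, mu, var)
    for _ in range(n_iter):
        diff = (Y - w @ B)[:, None, :] - mu[None, :, :]
        # Gradient of the log-likelihood w.r.t. the interpolated bias b = w @ B.
        grad_b = np.sum(post[:, :, None] * diff / var, axis=(0, 1))
        grad_w = B @ grad_b
        # Chain rule through the softmax, scaled by the number of frames.
        grad_theta = w * (grad_w - w @ grad_w) / len(Y)
        s = step
        while s > 1e-8:                   # backtracking: accept only non-decreasing steps
            theta_new = theta + s * grad_theta
            w_new = softmax(theta_new)
            ll_new, post_new = gmm_loglik_and_post(Y - w_new @ B, pi, mu, var)
            if ll_new >= ll:
                break
            s *= 0.5
        else:
            break                         # no improving step found; stop
        theta, w, ll, post = theta_new, w_new, ll_new, post_new
    return w, ll

# Synthetic demonstration (all values are made up for illustration).
rng = np.random.default_rng(0)
D, T = 4, 400
pi = np.array([0.5, 0.5])
mu = np.array([[-1.0] * D, [1.0] * D])
var = np.full((2, D), 0.25)
B = rng.normal(size=(3, D))               # biases of three "seen" handsets
w_true = np.array([0.7, 0.3, 0.0])        # the "unseen" handset lies between them
comp = rng.choice(2, size=T, p=pi)
clean = mu[comp] + np.sqrt(var[comp]) * rng.normal(size=(T, D))
Y = clean + w_true @ B                    # distorted test frames

ll_uniform, _ = gmm_loglik_and_post(Y - np.full(3, 1 / 3) @ B, pi, mu, var)
w_hat, ll_hat = estimate_interpolation_weights(Y, B, pi, mu, var)
```

Because of the softmax parameterization, the estimated weights stay non-negative and sum to one, and the backtracking step ensures the likelihood never falls below its value at the uniform starting point. A constrained optimizer working directly on the weights would be a natural alternative and could reach exact zeros, which the softmax can only approach.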

Authors: 廖元甫, 庄智显, 杨智合

Affiliations: Department of Electronic Engineering; Department of Communication Engineering

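Source: Tsinghua Science and Technology (indexed in SCIE, EI, CAS), 2008, No. 4, pp. 528-532 (5 pages). Chinese journal title: 清华大学学报（自然科学版）（英文版）, i.e. Journal of Tsinghua University (Science and Technology), English Edition.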

Funding: the Science Council of Taiwan, China (No. NSC95-2221-E-027-102)

Keywords: robust speaker identification; maximum likelihood estimation; handset mismatch compensation; Gaussian mixture model; maximum a posteriori

Classification: TP24 [Automation and Computer Technology: Detection Technology and Automatic Devices]
