
Maximum Likelihood A Priori Knowledge Interpolation-Based Handset Mismatch Compensation for Robust Speaker Identification

Abstract: Unseen handset mismatch is the major source of performance degradation in speaker identification in telecommunication environments. To alleviate the problem, a maximum likelihood a priori knowledge interpolation (ML-AKI)-based handset mismatch compensation approach is proposed. It first collects the characteristics of a set of seen handsets and uses them as a priori knowledge that represents the space of handsets. During evaluation, the characteristics of an unknown test handset are optimally estimated by interpolating within this a priori knowledge set. Experimental results on the HTIMIT database show that the ML-AKI method improves the average speaker identification rate from 60.0% to 74.6% compared with conventional maximum a posteriori-adapted Gaussian mixture models. The proposed ML-AKI method is therefore a promising approach to robust speaker identification.
Source: Tsinghua Science and Technology [清华大学学报(自然科学版)(英文版)], indexed in SCIE, EI, CAS; 2008, Issue 4, pp. 528-532 (5 pages)
Funding: the Science Council of Taiwan, China (No. NSC95-2221-E-027-102)
Keywords: robust speaker identification; maximum likelihood estimation; handset mismatch compensation; Gaussian mixture model; maximum a posteriori
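
The interpolation step described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: each handset characteristic is treated as an additive feature-space bias vector, a single diagonal Gaussian stands in for the Gaussian mixture model, the interpolation weights are constrained to be non-negative and sum to one, and projected gradient ascent is used as the solver. The names `project_to_simplex` and `estimate_interpolation_weights` are hypothetical and are not taken from the paper.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_k >= 0, sum_k w_k = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:
            theta = t  # keep the threshold of the last index that stays positive
    return [max(x - theta, 0.0) for x in v]


def estimate_interpolation_weights(frames, clean_mean, clean_var,
                                   seen_biases, iterations=500):
    """Maximum likelihood weights for bias = sum_k w[k] * seen_biases[k].

    Assumed model (not from the paper): each test frame is drawn from
    N(clean_mean + bias, diag(clean_var)).
    """
    num_handsets = len(seen_biases)
    dim = len(clean_mean)
    # Average shift of the test frames away from the clean-speech mean.
    shift = [sum(f[d] for f in frames) / len(frames) - clean_mean[d]
             for d in range(dim)]
    # Trace of B Sigma^-1 B^T bounds the curvature of the per-frame
    # log-likelihood, so 1/trace is a safe step size.
    curvature = sum(b[d] ** 2 / clean_var[d]
                    for b in seen_biases for d in range(dim))
    w = [1.0 / num_handsets] * num_handsets
    if curvature == 0.0:
        return w  # all seen biases are zero; the weights are not identifiable
    step = 1.0 / curvature
    for _ in range(iterations):
        bias = [sum(w[k] * seen_biases[k][d] for k in range(num_handsets))
                for d in range(dim)]
        # Gradient of the average per-frame log-likelihood with respect to w.
        grad = [sum((shift[d] - bias[d]) * seen_biases[k][d] / clean_var[d]
                    for d in range(dim))
                for k in range(num_handsets)]
        w = project_to_simplex([w[k] + step * grad[k]
                                for k in range(num_handsets)])
    return w


# Toy data: three "seen" handset biases in a 2-dimensional feature space.
seen_biases = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
clean_mean = [0.0, 0.0]
clean_var = [1.0, 1.0]
# Frames from an "unseen" handset whose bias is 0.6*b0 + 0.3*b1 + 0.1*b2.
frames = [[0.5 + e, 0.2 - e] for e in (-0.1, 0.0, 0.1)]

weights = estimate_interpolation_weights(frames, clean_mean, clean_var,
                                         seen_biases)
print([round(x, 3) for x in weights])
```

In this toy setup the frames are constructed so that the unseen bias lies inside the span of the seen biases, and the estimated weights recover the mixing proportions used to build it. A fuller treatment would replace the single Gaussian with a speaker GMM and use a general constrained optimizer.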

