Novel Classifier Algorithm Based on Kernel Fisher Discriminant and Its Application in Language Recognition

Abstract: GMM and SVM complement each other well in modeling and recognition performance, so GMM-SVM is widely used in language recognition, and GMM-MMI-SVM, which builds on it, has become a mainstream research method in the field. However, when making a decision SVM uses only a special subset of the training samples, namely the support vectors, rather than all of the samples, and this limits further improvement of the system's recognition performance. To address this problem, a classification algorithm based on the kernel Fisher discriminant (KFD), called GMM-MMI-KFD, is proposed. Its core idea is to replace the SVM classification criterion with the kernel Fisher criterion: a sequence of feature vectors is extracted from each speech segment and scored separately by a GMM-MMI classifier and a GMM-KFD classifier. Compared with SVM, KFD places more emphasis on the nonlinear distribution of speech data, and projecting the samples onto a high-dimensional space H allows the between-class distance to be maximized and the within-class distance to be minimized. Experimental results show that the GMM-MMI-KFD method achieves a higher recognition rate in language recognition.
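To make the kernel Fisher criterion concrete, the following is a minimal sketch of a standard two-class KFD in NumPy. It is not the paper's implementation: the abstract does not give the kernel, the regularization, or the GMM-derived input vectors, so the Gaussian (RBF) kernel, the `gamma` and `reg` values, the midpoint threshold, the function names `rbf_kernel`, `kfd_fit` and `kfd_score`, and the toy two-cluster data are all assumptions made for illustration. Unlike an SVM decision function, the coefficient vector `alpha` is computed from every training sample.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def kfd_fit(X, y, gamma=1.0, reg=1e-3):
    """Two-class kernel Fisher discriminant (labels 0 and 1).

    Returns expansion coefficients alpha and a midpoint threshold b.
    """
    K = rbf_kernel(X, X, gamma)            # n x n Gram matrix
    n = len(y)
    N = np.zeros((n, n))                   # within-class scatter in feature space
    means = []
    for c in (0, 1):
        Kc = K[:, y == c]                  # n x l_c kernel columns of class c
        lc = Kc.shape[1]
        means.append(Kc.mean(axis=1))      # kernelized class mean M_c
        center = np.eye(lc) - np.full((lc, lc), 1.0 / lc)
        N += Kc @ center @ Kc.T
    # N is singular in general, so regularize before solving for the direction
    # that maximizes between-class over within-class scatter.
    alpha = np.linalg.solve(N + reg * np.eye(n), means[1] - means[0])
    # Threshold halfway between the two projected class means (an assumption).
    b = -0.5 * alpha @ (means[0] + means[1])
    return alpha, b


def kfd_score(X_train, alpha, b, X_new, gamma=1.0):
    """Projection of new samples onto the discriminant; > 0 favors class 1."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha + b


# Toy data (hypothetical): two well-separated 2-D clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 0.3, size=(30, 2)),
               rng.normal(2.0, 0.3, size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)

alpha, b = kfd_fit(X, y, gamma=0.5)
scores = kfd_score(X, alpha, b, X, gamma=0.5)
accuracy = np.mean((scores > 0) == (y == 1))
```

In a system like the one described, the rows of `X` would be fixed-length vectors derived from the GMM stage rather than raw 2-D points, and the KFD score would be combined with the GMM-MMI score; how that combination is done is not stated in the abstract.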
Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2013, No. 10, pp. 257-260 (4 pages)
Funding: Supported by the National Natural Science Foundation of China (60872113)
Keywords: language recognition; kernel Fisher discriminant; classifier fusion; SVM; GMM-MMI