Classified Short-Utterance Speaker Verification System Based on SVM Score Fusion

Published English title: Classification Speaker Verification System with Short Speech Based on SVM Fusing Scores
Abstract: For text-independent speaker verification on short telephone speech (less than 30 s), partitioning the feature-parameter space into classes and modeling each class separately raises the problem of fusing the outputs of multiple sub-systems. To obtain a final score that reflects both the nonlinear relations among the sub-systems and their differing contributions, this paper proposes back-end score fusion with a support vector machine (SVM), which classifies the output score vectors into two classes (target speaker and impostor). Experiments on the NIST'03 database show that, for short speech, the method improves performance by about 11% relative to fusion by score addition. SVM fusion is suited not only to score-level fusion of multiple sub-systems but is also effective for other multi-system, multi-information fusion tasks.
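Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, PKU Core (北大核心), 2005, No. 2, pp. 213-217 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (60272039).
Keywords: information fusion; support vector machine; GMM-UBM (Gaussian mixture model-background model); speaker verification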
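
The back-end fusion idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the two-dimensional score vectors are synthetic stand-ins for sub-system outputs (not NIST'03 data), and the RBF kernel, `gamma`, `C`, the epoch count, and all function names are assumptions chosen for illustration. The SVM is trained by dual coordinate ascent in a bias-free form, with a constant added to the kernel to play the role of the bias, and it is compared with plain score addition thresholded at zero.

```python
import math
import random


def rbf_plus_one(u, v, gamma=0.5):
    """RBF kernel plus a constant 1.0; the constant stands in for the SVM bias."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist) + 1.0


def train_svm_fusion(X, y, C=1.0, epochs=50):
    """Train a bias-free kernel SVM by dual coordinate ascent.

    X holds score vectors (one score per sub-system), y holds labels
    (+1 target, -1 impostor). Returns the dual coefficients alpha.
    """
    n = len(X)
    K = [[rbf_plus_one(X[i], X[j]) for j in range(n)] for i in range(n)]
    alpha = [0.0] * n
    for _ in range(epochs):
        for i in range(n):
            f_i = sum(alpha[j] * y[j] * K[i][j] for j in range(n))
            # Exact maximiser of the dual along coordinate i, clipped to [0, C].
            step = (1.0 - y[i] * f_i) / K[i][i]
            alpha[i] = min(C, max(0.0, alpha[i] + step))
    return alpha


def fused_score(x, X, y, alpha):
    """Fused SVM score for one trial; positive means 'accept as target'."""
    return sum(a * t * rbf_plus_one(xi, x)
               for a, t, xi in zip(alpha, y, X) if a > 0.0)


def make_trials(center, label, n=40, spread=0.4):
    """Draw synthetic two-sub-system score vectors around a common center."""
    return [([random.gauss(center, spread), random.gauss(center, spread)], label)
            for _ in range(n)]


random.seed(0)
trials = make_trials(1.5, +1) + make_trials(-1.5, -1)
X = [scores for scores, _ in trials]
y = [label for _, label in trials]
alpha = train_svm_fusion(X, y)


def accept(x):
    """Decision of the SVM fusion back end for one score vector."""
    return fused_score(x, X, y, alpha) > 0.0


# Accuracy on the training trials themselves (no held-out set in this sketch).
svm_acc = sum(accept(x) == (t > 0) for x, t in zip(X, y)) / len(X)
sum_acc = sum((sum(x) > 0.0) == (t > 0) for x, t in zip(X, y)) / len(X)
print(f"SVM fusion accuracy: {svm_acc:.2f}, score-addition accuracy: {sum_acc:.2f}")
```

On cleanly separated synthetic scores like these, both fusion rules should do well, so the sketch shows the mechanics only. Any gain of the kind reported in the abstract would have to come from real sub-system scores whose relationship is nonlinear or whose contributions differ.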