
Speaker Verification Based on Prosodic Features

Cited by: 1
Abstract: In text-independent speaker recognition, prosodic features are used for speaker recognition tasks because they are relatively insensitive to channel effects and environmental noise. This paper models the prosodic parameters with a support vector machine (SVM) built on Gaussian mixture model (GMM) supervectors, and applies within-class covariance feature mapping to the model supervectors. The resulting single system improves on the conventional Gaussian mixture model-universal background model (GMM-UBM) baseline system by 40.19%. Fused with the paper's verification system based on acoustic cepstral parameters, it improves the recognition performance of the overall system by 9.25%. On the NIST (National Institute of Standards and Technology) 2006 speaker test database, the fused system achieves an equal error rate of 4.9%.

In text-independent speaker recognition systems, prosodic features are widely used to verify speaker identity because they are less sensitive to channel and noise effects than cepstral features. This paper proposes a verification method called the prosody Gaussian covariance projection-support vector machine (PGCP-SVM). The method is based on pitch and energy and their dynamic features. Unlike conventional techniques, target speaker models are built with a support vector machine (SVM) on Gaussian mixture model (GMM) mean supervectors. In particular, the within-class covariance projection technique is applied to the mean supervectors, which improves the performance of the prosodic system. Combined with the acoustic Mel-frequency cepstral coefficient system, the performance is improved by 9.25%. On the NIST 2006 SRE corpus, the equal error rate (EER) of the combined system reaches 4.9%.
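The abstract reports its headline result as an equal error rate (EER). As a minimal sketch of how that metric is usually computed from trial scores (the function name `equal_error_rate`, the convention of accepting a trial when its score is at or above the threshold, and the averaging of the two rates at the closest point are assumptions for illustration, not details taken from the paper):

```python
def equal_error_rate(target_scores, impostor_scores):
    """Return (eer, threshold) at the operating point where the false-reject
    rate and the false-accept rate are closest to each other.

    Assumed convention: a trial is accepted when score >= threshold.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best = None
    for t in thresholds:
        # False rejects: true-speaker trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False accepts: impostor trials scored at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the mean of the two rates at the closest point.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]
```

With the toy scores `[0.9, 0.8, 0.7, 0.4]` for target trials and `[0.6, 0.3, 0.2, 0.1]` for impostor trials, this returns an EER of 0.25 at threshold 0.6. Evaluation tools often interpolate between thresholds instead of picking the nearest one, so reported values may differ slightly from this sketch.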
Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2010, No. 1, pp. 76-80 (5 pages)
Funding: NSFC-Microsoft Research Asia joint funding project; National Natural Science Foundation of China (60970161)
Keywords: speaker verification; prosodic features; supervector; within-class covariance projection
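The within-class covariance projection named in the abstract and keywords can be illustrated with a small whitening sketch. This is one common form of the idea (estimate the pooled within-speaker covariance W, factor it as W = L Lᵀ, and map each vector x to L⁻¹x so that within-speaker scatter becomes the identity); it is an assumption that the paper's projection takes this form, and the helper names `within_class_covariance`, `cholesky`, and `project` are hypothetical.

```python
import math

def within_class_covariance(vectors_by_speaker):
    """Pooled covariance of vectors around their own speaker's mean."""
    dim = len(next(iter(vectors_by_speaker.values()))[0])
    W = [[0.0] * dim for _ in range(dim)]
    total = 0
    for vecs in vectors_by_speaker.values():
        mean = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
        for v in vecs:
            d = [v[i] - mean[i] for i in range(dim)]
            for i in range(dim):
                for j in range(dim):
                    W[i][j] += d[i] * d[j]
        total += len(vecs)
    return [[W[i][j] / total for j in range(dim)] for i in range(dim)]

def cholesky(A):
    """Lower-triangular L with L * L^T = A (A must be positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def project(L, x):
    """Return y = L^{-1} x, solved by forward substitution."""
    y = []
    for i in range(len(x)):
        y.append((x[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y
```

In a real system the supervectors are very high-dimensional and W is typically rank-deficient, so some smoothing or dimensionality reduction is needed before factoring; the toy two-dimensional case only shows the mechanics. The projected vectors would then be fed to the SVM in place of the raw supervectors.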


