Perceptual MVDR-based cepstral coefficients (PMCCs) for speaker recognition (cited by: 2)

Abstract: A feature extraction technique named perceptual MVDR-based cepstral coefficients (PMCCs) is introduced into speaker recognition. PMCCs are extracted and modeled with Gaussian Mixture Models (GMMs), and joint factor analysis (JFA) is used to compensate for speaker and channel variability. Experiments are carried out on the core conditions of the NIST 2008 speaker recognition evaluation data. The results show that systems based on PMCCs achieve performance comparable to systems based on conventional MFCCs. Moreover, fusing the two kinds of systems improves significantly on the MFCC system alone, reducing the equal error rate (EER) by 7.6% to 30.5% and the minimum detection cost function (minDCF) by 3.2% to 21.2%, depending on the test set. The results indicate that PMCCs can be applied effectively to speaker recognition and that they are complementary to MFCCs to some extent.
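The EER figures quoted in the abstract can be made concrete with a minimal sketch. It assumes the quoted percentages are relative reductions against the MFCC baseline, which the abstract does not state explicitly. The function names `equal_error_rate` and `relative_reduction` and all scores below are hypothetical illustrations, not the authors' scoring tools or data.

```python
import numpy as np

def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores."""
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # Miss rate: fraction of target trials scored below the threshold.
    miss = np.array([(target < t).mean() for t in thresholds])
    # False-alarm rate: fraction of non-target trials scored at or above it.
    fa = np.array([(nontarget >= t).mean() for t in thresholds])
    # The EER is taken where the two error rates are closest to equal.
    i = np.argmin(np.abs(miss - fa))
    return (miss[i] + fa[i]) / 2.0

def relative_reduction(baseline, improved):
    """Relative reduction of an error metric with respect to a baseline."""
    return (baseline - improved) / baseline

# Hypothetical scores, for illustration only (not from the paper).
separable = equal_error_rate([2.0, 3.0, 4.0], [-1.0, 0.0, 1.0])
overlapping = equal_error_rate([0.0, 1.0], [0.5, -1.0])
print(separable, overlapping)
```

Under this reading, a baseline EER of 10% that drops to 7% after fusion corresponds to `relative_reduction(10.0, 7.0)`, a 30% relative reduction.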
Source: Chinese Journal of Acoustics (声学学报(英文版)), 2012, No. 4, pp. 489-498 (10 pages)
Funding: Supported by the National Natural Science Foundation of China (10925419, 90920302, 61072124, 11074275, 11161140319) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100).