
Perceptual MVDR-based cepstral coefficients for speaker recognition

Cited by: 4
Abstract: This paper studies the application of perceptual minimum variance distortionless response (MVDR)-based cepstral coefficients (PMCCs) to speaker recognition. PMCCs are extracted and modeled with Gaussian mixture models (GMMs), and joint factor analysis (JFA) is used to model the speaker and channel variability in the GMMs. PMCCs and conventional Mel-frequency cepstral coefficients (MFCCs) are each evaluated on the core test of the NIST 2008 Speaker Recognition Evaluation. The results show that systems based on the two features achieve comparable performance. After linear fusion of the two systems, the equal error rate (EER) falls by a relative 7.6% to 30.5% and the minimum detection cost function (minDCF) by a relative 3.2% to 21.2% across the different test sets, compared with the MFCC system alone. The experiments indicate that PMCCs can be applied effectively to speaker recognition and are complementary to MFCCs to some extent.
Source: Acta Acustica (《声学学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, No. 6, pp. 673-678 (6 pages)
Funding: National Natural Science Foundation of China (grant nos. 10925419, 90920302, 10874203, 60875014, 61072124, 11074275)
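
The abstract reports its results as relative reductions in equal error rate (EER) after linear fusion of two systems' scores. As a rough illustration of how such figures are computed, the Python sketch below uses made-up scores for two hypothetical systems. The function names, the equal fusion weight, and all data are assumptions for illustration only; the paper's actual fusion weights and scoring pipeline are not given in this record.

```python
def error_rates(target_scores, nontarget_scores, threshold):
    """Return (miss rate, false-alarm rate) when trials scoring >= threshold are accepted."""
    p_miss = sum(s < threshold for s in target_scores) / len(target_scores)
    p_fa = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
    return p_miss, p_fa


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping thresholds over the observed scores and
    taking the operating point where miss and false-alarm rates are closest."""
    best = None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        p_miss, p_fa = error_rates(target_scores, nontarget_scores, t)
        gap = abs(p_miss - p_fa)
        if best is None or gap < best[0]:
            best = (gap, (p_miss + p_fa) / 2)
    return best[1]


def linear_fusion(scores_a, scores_b, weight=0.5):
    """Weighted sum of two systems' scores for the same trials (weight is an assumption)."""
    return [weight * a + (1 - weight) * b for a, b in zip(scores_a, scores_b)]


def relative_reduction(baseline, improved):
    """Relative reduction in percent, e.g. of EER, with respect to a baseline."""
    return 100.0 * (baseline - improved) / baseline


# Synthetic scores for the same four target and four non-target trials.
# "mfcc" and "pmcc" are placeholder labels, not the paper's data.
mfcc_tgt, mfcc_non = [2.0, 1.5, 0.2, 1.8], [0.5, -1.0, -0.5, -2.0]
pmcc_tgt, pmcc_non = [1.0, 0.4, 1.6, 1.2], [-1.0, 0.9, -0.8, -1.5]

eer_mfcc = equal_error_rate(mfcc_tgt, mfcc_non)
eer_pmcc = equal_error_rate(pmcc_tgt, pmcc_non)
eer_fused = equal_error_rate(
    linear_fusion(mfcc_tgt, pmcc_tgt), linear_fusion(mfcc_non, pmcc_non)
)

print("EER (MFCC-like system):", eer_mfcc)
print("EER (PMCC-like system):", eer_pmcc)
print("EER (fused):", eer_fused)
print("Relative EER reduction (%):", relative_reduction(eer_mfcc, eer_fused))
```

In this toy data each system misranks a different trial, so averaging the scores removes both errors. That is the sense in which two feature sets can be complementary, although real evaluations involve far more trials and the gains are smaller, as the ranges quoted in the abstract show.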

