Perceptual MVDR-based cepstral coefficients (PMCCs) for speaker recognition

Cited by: 2

Abstract: A feature extraction technique, perceptual MVDR-based cepstral coefficients (PMCCs), is introduced into speaker recognition. PMCCs are extracted and modeled using Gaussian mixture models (GMMs). To compensate for speaker and channel variability, joint factor analysis (JFA) is used. Experiments are carried out on the core conditions of the NIST 2008 speaker recognition evaluation data. The results show that systems based on PMCCs achieve performance comparable to systems based on conventional MFCCs. In addition, fusing the two kinds of systems gives a significant improvement over the MFCC system alone, reducing the equal error rate (EER) by 7.6% to 30.5% and the minimum detection cost function (minDCF) by 3.2% to 21.2% on different test sets. These results indicate that PMCCs can be applied effectively in speaker recognition and that they are, to some extent, complementary to MFCCs.
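The two metrics quoted in the abstract, and the way a percentage reduction between two systems is computed, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names are hypothetical, the cost parameters (`c_miss=10`, `c_fa=1`, `p_target=0.01`) are the values commonly associated with NIST evaluations of that period and are not given in this record, and the equal-weight linear score fusion is a stand-in, since the record does not say how the PMCC and MFCC systems were fused.

```python
import numpy as np

def error_rates(target_scores, nontarget_scores):
    """Miss and false-alarm rates at every observed score used as a threshold."""
    tar = np.asarray(target_scores, dtype=float)
    non = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([tar, non]))
    # A trial is accepted when its score is >= threshold.
    p_miss = np.array([(tar < t).mean() for t in thresholds])
    p_fa = np.array([(non >= t).mean() for t in thresholds])
    return p_miss, p_fa

def eer(target_scores, nontarget_scores):
    """Equal error rate: the point where miss and false-alarm rates are closest."""
    p_miss, p_fa = error_rates(target_scores, nontarget_scores)
    i = np.argmin(np.abs(p_miss - p_fa))
    return 0.5 * (p_miss[i] + p_fa[i])

def min_dcf(target_scores, nontarget_scores, c_miss=10.0, c_fa=1.0, p_target=0.01):
    """Minimum (unnormalized) detection cost over the observed thresholds.

    The default cost parameters are assumed, not taken from the record.
    """
    p_miss, p_fa = error_rates(target_scores, nontarget_scores)
    dcf = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    return dcf.min()

def relative_reduction(baseline, improved):
    """Percentage reduction of a metric relative to the baseline system."""
    return 100.0 * (baseline - improved) / baseline

def fuse_scores(scores_a, scores_b, weight=0.5):
    """Hypothetical linear score-level fusion of two systems on the same trials."""
    a = np.asarray(scores_a, dtype=float)
    b = np.asarray(scores_b, dtype=float)
    return weight * a + (1.0 - weight) * b

# Toy scores, invented purely to exercise the functions.
tar = [2.0, 1.5, 0.5, 3.0]
non = [0.0, 1.0, -1.0, 0.2]
print(eer(tar, non), min_dcf(tar, non))
print(relative_reduction(10.0, 7.0))  # a drop from 10% to 7% EER is a 30% reduction
```

In practice the scores of the two systems would normally be calibrated or normalized to a common scale before fusion, and the fusion weights would be tuned on held-out data; the sketch skips both steps.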

Authors: LIANG Chunyan, ZHANG Xiang, YANG Lin, ZHANG Jianping, YAN Yonghong

Affiliation: Key Laboratory of Speech Acoustics and Content Understanding

Source: Chinese Journal of Acoustics (声学学报, English edition), 2012, No. 4, pp. 489-498 (10 pages)

Funding: Supported by the National Natural Science Foundation of China (10925419, 90920302, 61072124, 11074275, 11161140319) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100)

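Classification number: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]; U666.7 [Transportation Engineering: Ship and Waterway Engineering]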
