
Feature extraction for speaker recognition based on auditory model (cited by: 2)
Abstract: Drawing on the characteristics of the auditory model and following the MFCC extraction procedure, this paper proposes a speaker feature extraction method based on a Gammatone filter bank. The method computes cepstral coefficients with a Gammatone filter bank in place of the commonly used triangular filter bank, and both the number of channels and the bandwidth (equivalent rectangular bandwidth) of the Gammatone filters can be adjusted. The resulting features were evaluated in simulation experiments on a Gaussian mixture model (GMM) recognition system. The results indicate that, under certain conditions, the proposed feature achieves a higher recognition rate than MFCC, and that the recognition rate is higher when the Gammatone filter bank has more channels or a smaller filter bandwidth.
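Authors: He Zhaoxia (何朝霞), Pan Ping (潘平)
Source: Microcomputer & Its Applications (《微型机与应用》), 2012, No. 1, pp. 37-39 (3 pages)
Funding: National Science and Technology Program fund project (2008RR0003); Guizhou Province International Science and Technology Cooperation Program fund projects ([2009]700109, [2009]700125)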
Keywords: auditory model; Gammatone filter bank; MFCC; feature; recognition rate
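
The pipeline summarized in the abstract (the MFCC procedure with the triangular filter bank swapped for a Gammatone filter bank, followed by a cepstral transform) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The record gives no parameter values, so the frame length, hop size, FFT size, log compression, Glasberg-Moore ERB formula, and frequency-domain approximation of the 4th-order Gammatone response are all assumptions. The names `erb_space`, `gammatone_weights`, `gammatone_cepstrum`, and the `bw_scale` parameter are hypothetical.

```python
import numpy as np


def erb_space(f_low, f_high, n_channels):
    """Center frequencies (Hz) equally spaced on the ERB-rate scale (assumed spacing)."""
    def hz_to_erb_rate(f):
        return 21.4 * np.log10(4.37e-3 * f + 1.0)

    def erb_rate_to_hz(e):
        return (10.0 ** (e / 21.4) - 1.0) / 4.37e-3

    e = np.linspace(hz_to_erb_rate(f_low), hz_to_erb_rate(f_high), n_channels)
    return erb_rate_to_hz(e)


def gammatone_weights(n_channels, n_fft, sr, f_low=50.0, bw_scale=1.0, order=4):
    """Approximate squared-magnitude Gammatone responses sampled on the FFT bins.

    bw_scale is a hypothetical knob standing in for the adjustable bandwidth
    mentioned in the abstract: values below 1 narrow every filter.
    """
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)        # (n_bins,)
    fc = erb_space(f_low, sr / 2.0, n_channels)       # (n_channels,)
    erb = 24.7 * (4.37e-3 * fc + 1.0)                 # Glasberg-Moore ERB in Hz
    b = 1.019 * bw_scale * erb                        # Gammatone bandwidth parameter
    ratio = (freqs[None, :] - fc[:, None]) / b[:, None]
    # |H(f)| ~ [1 + ratio^2]^(-order/2), so |H(f)|^2 ~ [1 + ratio^2]^(-order)
    return (1.0 + ratio ** 2) ** (-order)


def gammatone_cepstrum(signal, sr, n_channels=32, n_ceps=13,
                       frame_len=400, hop=160, n_fft=512, bw_scale=1.0):
    """Frame -> power spectrum -> Gammatone bank -> log -> DCT-II (MFCC-like)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2   # (n_frames, n_bins)
    weights = gammatone_weights(n_channels, n_fft, sr, bw_scale=bw_scale)
    log_energy = np.log(power @ weights.T + 1e-10)              # (n_frames, n_channels)
    # Unnormalized DCT-II basis, written out to avoid extra dependencies
    m = np.arange(n_channels)
    k = np.arange(n_ceps)
    dct_basis = np.cos(np.pi * k[:, None] * (m[None, :] + 0.5) / n_channels)
    return log_energy @ dct_basis.T                             # (n_frames, n_ceps)
```

Calling `gammatone_cepstrum(x, 16000, n_channels=64, bw_scale=0.5)` on a 16 kHz waveform `x` would correspond to the "more channels, narrower bandwidth" setting that the abstract reports as favorable. The resulting per-frame vectors would then be used to train one GMM per speaker.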

References (8)

  • 1 JOHANNESMA P I M. The pre-response stimulus ensemble of neurons in the cochlear nucleus [C]. Proceedings of the Symposium on Hearing Theory, 1972: 58-69.
  • 2 COOKE M P. Modeling auditory processing and organization [M]. Cambridge, U.K.: Cambridge University Press, 1993.
  • 3 Han Jiqing, Zhang Lei, Zheng Tieran. Speech Signal Processing [M]. Beijing: Tsinghua University Press, 2008.
  • 4 SLANEY M. An efficient implementation of the Patterson-Holdsworth auditory filter bank [R]. Apple Computer Technical Report #35, Perception Group, Advanced Technology Group. Apple Computer, Inc., 1993.
  • 5 Shao Yang, Wang Deliang. Robust speaker identification using auditory features and computational auditory scene analysis [C]. IEEE International Conference on Acoustics, Speech, and Signal Processing, 2008: 1589-1592.
  • 6 SRINIVASAN S, Wang Deliang. Transforming binary uncertainties for robust speech recognition [J]. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(7): 2130-2140.
  • 7 Wang Deliang, BROWN G J. Computational auditory scene analysis: principles, algorithms, and applications [M]. Hoboken, NJ: Wiley-IEEE Press, 2006.
  • 8 Wang Yue, Qian Zhihong, Wang Xue, Cheng Guangming. Research on an auditory feature extraction algorithm based on a Gammatone filter bank [J]. Acta Electronica Sinica, 2010, 38(3): 525-528. (cited by: 28)


