Application of formant instantaneous characteristics to speech recognition and speaker identification

Abstract: This paper proposes a new phase feature derived from the formant instantaneous characteristics for speech recognition (SR) and speaker identification (SI) systems. Using the Hilbert transform (HT), the formant characteristics can be represented by the instantaneous frequency (IF) and the instantaneous bandwidth, together called the formant instantaneous characteristics (FIC). To explore the importance of FIC in both SR and SI, the paper derives different features from FIC for SR and for SI systems. When these new features are combined with conventional parameters, a higher identification rate is achieved than with Mel-frequency cepstral coefficients (MFCC) alone. The experimental results show that the new features are effective characteristic parameters and can serve as a complement to conventional parameters for SR and SI.
Source: Journal of Shanghai University (English Edition) (上海大学学报(英文版)), CAS, 2011, No. 2, pp. 123-127 (5 pages)
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 60903186) and the Shanghai Leading Academic Discipline Project (Grant No. J50104)
Keywords: instantaneous frequency (IF); Hilbert transform (HT); speech recognition; speaker identification; Mel-frequency cepstral coefficients (MFCC)
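
The abstract describes obtaining the instantaneous frequency and instantaneous bandwidth of a formant through the Hilbert transform. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the input has already been band-limited around a single formant, and it assumes one common definition of instantaneous bandwidth, |a'(t)/a(t)|/(2π) for envelope a(t), since the abstract gives no formula. The function names `analytic_signal` and `instantaneous_features` are hypothetical, and the naive DFT is used only to keep the example dependency-free; in practice a library routine such as `scipy.signal.hilbert` would likely be preferable.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence, computed with a naive O(N^2) DFT."""
    n = len(x)
    # Forward DFT of the real input.
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative frequencies.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            weight = 1.0
        elif k < (n + 1) // 2:
            weight = 2.0
        else:
            weight = 0.0
        spectrum[k] *= weight
    # Inverse DFT gives x(t) + j * Hilbert{x}(t).
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def instantaneous_features(x, fs):
    """Return (instantaneous frequency, instantaneous bandwidth), both in Hz.

    The bandwidth definition |a'(t)/a(t)| / (2*pi) is an assumption made
    for this sketch; the abstract does not state the formula used.
    """
    z = analytic_signal(x)
    env = [abs(v) for v in z]
    inst_freq, inst_bw = [], []
    for a, b, ea, eb in zip(z, z[1:], env, env[1:]):
        # Phase increment between neighbouring samples, in (-pi, pi].
        dphi = cmath.phase(b * a.conjugate())
        inst_freq.append(dphi * fs / (2 * math.pi))
        # Relative rate of change of the envelope.
        mean_env = 0.5 * (ea + eb)
        rate = (eb - ea) * fs
        inst_bw.append(abs(rate) / (2 * math.pi * mean_env) if mean_env > 0 else 0.0)
    return inst_freq, inst_bw


# Usage: a steady 100 Hz tone sampled at 1 kHz (exactly 20 cycles).
fs = 1000.0
tone = [math.cos(2 * math.pi * 100.0 * t / fs) for t in range(200)]
freq, bw = instantaneous_features(tone, fs)
```

For a steady tone the estimated frequency stays at the tone frequency and the bandwidth stays near zero. A real formant would show time-varying values, which is the kind of information the proposed features are meant to capture alongside MFCC.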