
Harmonic intensity feature for robust speech recognition (cited by: 1)
Abstract: Automatic speech recognition (ASR) in noisy environments is a challenging problem: the recognition performance of systems based on traditional Mel-frequency cepstral coefficient (MFCC) features degrades sharply under additive noise. To address this, the paper proposes harmonic intensity (H), a feature that reflects the degree of harmonicity of the speech signal, and uses it to replace the energy dimension of the traditional MFCC feature vector (the zero-order cepstral coefficient C_0 or the frame energy E). A C_0-based ASR system, an E-based ASR system, and an H-based ASR system were tested on noise-corrupted speech. Experiments under exhibition-hall noise, crowd noise, and car noise show that the H-based system achieves a higher average recognition rate and better noise robustness than the systems based on the traditional features.
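Authors: XU Chao (许超), CAO Zhigang (曹志刚)
Source: Journal of Tsinghua University (Science and Technology) 《清华大学学报(自然科学版)》, indexed in EI, CAS, CSCD, and PKU Core; 2004, No. 1, pp. 22-24, 28 (4 pages)
Funding: National Natural Science Foundation of China (60072011)
Keywords: noise robustness; speech recognition; harmonic model; Mel cepstral coefficients (MFCC)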
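
The abstract describes swapping the energy dimension of an MFCC vector for a harmonicity measure, but this record does not give the paper's definition of harmonic intensity. The sketch below is therefore only an illustration under stated assumptions: it uses the peak of the normalized autocorrelation in a typical pitch-lag range as a stand-in harmonicity score, and the function names, the 60-400 Hz pitch range, and the convention that the energy term sits in column 0 are all hypothetical.

```python
import numpy as np


def harmonicity(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Stand-in harmonicity score (NOT the paper's harmonic intensity).

    Returns the largest normalized autocorrelation value over lags that
    correspond to pitches between f0_min and f0_max. Strongly periodic
    frames score high; noise-like frames score low.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - np.mean(frame)
    energy = np.dot(frame, frame)
    if energy == 0.0:
        return 0.0
    # Full autocorrelation; keep the non-negative lags only.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(frame) - 1)
    return float(np.max(ac[lag_min:lag_max + 1]) / energy)


def replace_energy_dim(features, frames, sample_rate, energy_index=0):
    """Return a copy of `features` whose energy column holds the score.

    `features` is a (num_frames, num_coeffs) array; `energy_index` is the
    column assumed to hold C_0 or E (an assumption of this sketch).
    """
    out = np.array(features, dtype=float, copy=True)
    out[:, energy_index] = [harmonicity(f, sample_rate) for f in frames]
    return out


if __name__ == "__main__":
    sr = 8000
    t = np.arange(400) / sr
    voiced_like = np.sin(2 * np.pi * 200.0 * t)            # periodic frame
    noise_like = np.random.default_rng(0).standard_normal(400)
    feats = np.zeros((2, 13))                               # placeholder MFCCs
    print(replace_energy_dim(feats, [voiced_like, noise_like], sr)[:, 0])
```

With such a score, a periodic (voiced-like) frame gets a value close to 1 while a noise-only frame stays low. This is one plausible reason a harmonicity term could be less sensitive to additive noise than raw energy, though the paper's own feature may be defined quite differently.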

