采用特征空间随机映射的鲁棒性语音识别 (cited by: 5)

Robust speech recognition by adopting random projection in feature space
Abstract: Speech recognition performance degrades significantly under noise. To address this problem, a robust speech recognition method based on random projection (RP) in feature space is proposed and applied to a speech recognition system for in-car driving environments. First, the original speech feature parameters are linearly mapped into a new feature space using random matrices, so that the new features preserve the distances between the original features with maximum probability while following a distribution closer to Gaussian. Then a hidden Markov model (HMM) is trained for every word. In the test stage, a majority voting strategy is applied to the initial pattern matching results to produce the final recognition decision. Experiments on CENSREC-2, the in-car speech recognition database of the Information Processing Society of Japan, show that random projection features greatly improve speech recognition performance in driving environments.
Source: 《计算机应用》 (Journal of Computer Applications), indexed in CSCD and the Peking University Core Journals list, 2012, No. 7, pp. 2070-2073, 2081 (5 pages)
Keywords: speech recognition; random projection (RP); majority voting; CENSREC-2
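
The pipeline summarized in the abstract (random projection of features, pattern matching, majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation. Several details are assumptions made only for illustration: the random matrices have Gaussian entries scaled by 1/sqrt(d_out), one recognizer is run per random matrix, and a hypothetical nearest-template matcher (`nearest_label`) stands in for the per-word HMM scoring, which the record does not describe in detail.

```python
import random
from collections import Counter


def random_matrix(d_out, d_in, rng):
    """Gaussian random matrix (assumed form), scaled so that squared
    distances are preserved in expectation."""
    scale = 1.0 / d_out ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)]
            for _ in range(d_out)]


def project(matrix, vec):
    """Linearly map one feature vector into the new feature space."""
    return [sum(r * x for r, x in zip(row, vec)) for row in matrix]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_label(vec, templates):
    """Hypothetical stand-in for HMM pattern matching: closest template wins."""
    return min(templates, key=lambda label: sq_dist(vec, templates[label]))


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def recognize(vec, templates, matrices):
    """Run one initial match per random matrix, then vote on the results."""
    votes = []
    for m in matrices:
        projected = {label: project(m, t) for label, t in templates.items()}
        votes.append(nearest_label(project(m, vec), projected))
    return majority_vote(votes)


rng = random.Random(0)
matrices = [random_matrix(8, 12, rng) for _ in range(5)]   # 5 projections, 12 -> 8 dims
templates = {"yes": [1.0] * 12, "no": [-1.0] * 12}          # toy word templates
noisy = [1.0 + rng.gauss(0.0, 0.3) for _ in range(12)]      # noisy "yes" features
print(recognize(noisy, templates, matrices))
```

In a real system each projected feature stream would train and score its own set of word HMMs; the voting step stays the same, taking one label per projection and returning the most frequent one.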