Robust Speech Recognition Using Histogram Equalization of Classified Features (cited by: 4)
Abstract: Most noise introduces nonlinear distortion into cepstral-domain speech features, which degrades the performance of a recognition system. Histogram equalization (HEQ) is a nonlinear compensation transform, and it improves robustness beyond what conventional noise-robust methods based on linear transforms achieve. In practical recognition systems, however, noise-induced nonlinear distortion is not the only problem: the class distribution of speech features also differs between training and test data because of phonetic or acoustic differences, so conventional HEQ cannot be relied on to deliver its full benefit. This paper proposes an HEQ method based on feature classification. The noisy feature vectors are first pre-equalized, then grouped into classes by K-means clustering, and finally the feature vectors within each class are equalized again. Experiments show that at low signal-to-noise ratios, under both stationary and non-stationary noise, the proposed method makes the recognition system more robust than conventional HEQ.
Authors: 姜莹 (Jiang Ying), 俞一彪 (Yu Yibiao)
Source: Journal of Signal Processing (《信号处理》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 6, pp. 896-900 (5 pages)
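Funding: Supported by the Beijing Key Laboratory of "Modern Information Science and Network Technology" and the Ministry of Railways Open Laboratory of "Railway Information Science and Engineering" (Grant No. XDXX1006)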
Keywords: speech recognition; histogram equalization; feature classification; robustness
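
The two-stage procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`heq`, `kmeans`, `assign`, `class_heq`) are hypothetical, the reference distribution is taken to be the empirical distribution of clean training features, K-means is run on the pre-equalized noisy features, and each class's reference is assumed to be the training frames nearest to that class centroid. The abstract does not specify these details.

```python
import numpy as np

def heq(x, ref):
    """Histogram equalization: map each column of x so that its
    empirical CDF matches the empirical CDF of the same column of ref."""
    x = np.asarray(x, dtype=float)
    ref = np.asarray(ref, dtype=float)
    out = np.empty_like(x)
    n = x.shape[0]
    for d in range(x.shape[1]):
        # Rank-based empirical CDF value of every frame, strictly inside (0, 1).
        ranks = np.argsort(np.argsort(x[:, d]))
        cdf = (ranks + 0.5) / n
        # Inverse reference CDF, approximated by the reference quantiles.
        out[:, d] = np.quantile(ref[:, d], cdf)
    return out

def assign(x, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per frame."""
    d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns the k centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = assign(x, centroids)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids

def class_heq(test, train, k=4, min_count=2):
    """Stage 1: global HEQ. Stage 2: K-means on the pre-equalized
    features, then HEQ again within each class.
    Assumption: the per-class reference is the set of training frames
    whose nearest centroid is that class."""
    train = np.asarray(train, dtype=float)
    pre = heq(test, train)
    centroids = kmeans(pre, k)
    test_labels = assign(pre, centroids)
    train_labels = assign(train, centroids)
    out = pre.copy()
    for j in range(k):
        t_idx = test_labels == j
        r_idx = train_labels == j
        # Skip classes with too few frames on either side.
        if t_idx.sum() >= min_count and r_idx.sum() >= min_count:
            out[t_idx] = heq(pre[t_idx], train[r_idx])
    return out
```

Calling `class_heq(test, train, k=4)` on two arrays of shape `(frames, dims)` returns an array shaped like `test`. Classes with fewer than `min_count` frames keep their stage-1 values, which is one simple way to avoid estimating a CDF from almost no data.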


