
Optimization of acoustic model for Uyghur continuous speech recognition
Cited by: 4
Abstract: This paper applies the Tandem feature extraction method, which combines the strengths of the Gaussian mixture model (GMM) and artificial neural network (ANN) frameworks commonly used in speech recognition, to Uyghur acoustic model training. A series of post-processing steps converts the original Mel-frequency cepstral coefficient (MFCC) features into Tandem features, which serve as the input to a hidden Markov model (HMM) based speech recognition system. The acoustic model is then trained discriminatively under the minimum phone error (MPE) criterion, and recognition experiments are run on the test set. The results show that the MPE-trained acoustic model on Tandem features reduces the word error rate by 13% relative to the original system trained with maximum likelihood estimation.
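Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, 2013, No. 2, pp. 145-147 (3 pages)
Funding: National Natural Science Foundation of China (No. 61063024); Open Project of the Xinjiang Key Laboratory of Multilingual Information Processing (No. 049807)
Keywords: Uyghur; speech recognition; minimum phone error; Tandem feature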
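
The abstract does not spell out its "series of post-processing steps". The following is a minimal sketch, assuming the recipe commonly described in the Tandem literature: take the logarithm of the per-frame phone posteriors produced by a neural network, decorrelate them with PCA, and append the result to the MFCC stream. The function names, array sizes, and random data are hypothetical stand-ins, and the concatenation with MFCC is an assumption rather than something the abstract states. Neural network training and MPE training are not shown.

```python
import numpy as np


def fit_pca(log_post, n_components):
    """Estimate a PCA projection from (n_frames, n_phones) log-posteriors."""
    mean = log_post.mean(axis=0)
    cov = np.cov(log_post - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest-variance directions
    return mean, eigvecs[:, order]


def tandem_features(mfcc, posteriors, mean, basis, eps=1e-10):
    """Append PCA-decorrelated log-posteriors to the MFCC frames."""
    log_post = np.log(posteriors + eps)               # log makes posteriors less skewed
    decorrelated = (log_post - mean) @ basis          # suits diagonal-covariance GMMs
    return np.hstack([mfcc, decorrelated])


# Hypothetical data: 200 frames, 39-dim MFCC, 32 phone classes.
rng = np.random.default_rng(0)
mfcc = rng.normal(size=(200, 39))
logits = rng.normal(size=(200, 32))
posteriors = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)  # stand-in for network output

# In a real system the projection would be estimated on training data only.
mean, basis = fit_pca(np.log(posteriors + 1e-10), n_components=25)
features = tandem_features(mfcc, posteriors, mean, basis)
print(features.shape)
```

The resulting matrix would replace the plain MFCC input to the HMM system. Estimating the PCA on training data and reusing it for test data keeps both feature spaces identical.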

References (7)

  • 1 Guo Renwei (郭人玮). A preliminary study of minimum phone error discriminative acoustic model training for Mandarin large-vocabulary continuous speech recognition[D]. Taiwan, China: Taiwan University, 2005.
  • 2 Hermansky H, Ellis D P W, Sharma S. Tandem connectionist feature extraction for conventional HMM systems[C]//Acoustics, Speech and Signal Processing, ICASSP 2000. Istanbul: [s.n.], 2000, 3: 1635-1638.
  • 3 Ellis D P W, Singh R, Sivadas S. Tandem acoustic modeling in large-vocabulary recognition[C]//Acoustics, Speech and Signal Processing, ICASSP 2001. Salt Lake City, Utah, USA: [s.n.], 2001, 1: 517-520.
  • 4 Povey D, Woodland P C. Minimum phone error and I-smoothing for improved discriminative training[C]//Acoustics, Speech and Signal Processing, ICASSP 2002. Orlando, Florida, USA: [s.n.], 2002, 1: 105-108.
  • 5 Faria A. An investigation of Tandem MLP feature for ASR, TR-07-003[R]. USA: ICSI, 2007.
  • 6 Stolcke A. SRILM-an extensible language modeling toolkit[C]//Proc Intl Conf on Spoken Language Processing. Denver: [s.n.], 2002, 2: 901-904.
  • 7 Young S, Kershaw D, Odell J, et al. The HTK book[EB/OL]. (2006-08-06) [2011-09-20]. http://htk.eng.cam.ac.uk/.

