
Speech Recognition Model Compression Algorithm Based on the Bayesian Information Criterion (BIC)
Abstract: When components are added to an HMM (Hidden Markov Model) speech model through GMM (Gaussian Mixture Model) discriminative training, the recognition rate rises with the number of GMM components, but so does the model size, which leaves the model bloated. For on-device recognition on mobile platforms, storing a local model of several hundred megabytes is impractical. To address this, the paper proposes using the BIC criterion to compress a speech model with many GMM components down to a specified number of components, so that the recognition rate is preserved as far as possible while the model stays at an acceptable size. Experimental results show that a model compressed with this method achieves a higher recognition rate than an uncompressed speech recognition model with the same number of components.
Authors: 邹灿, 李柏岩
Source: Computer and Modernization (《计算机与现代化》), 2014, No. 6, pp. 71-73, 78 (4 pages)
Keywords: speech recognition; model compression; BIC (Bayesian information criterion)
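
The abstract does not spell out the compression procedure, so the following is only a minimal sketch of one way BIC can drive a reduction to a fixed component count. It assumes diagonal-covariance Gaussians, greedy pairwise merging by moment matching, and a local ΔBIC score (likelihood loss minus a penalty for the parameters saved). The function names `merge_pair`, `delta_bic`, and `compress_gmm`, the `n_frames` occupancy estimate, and the `penalty_weight` parameter are all hypothetical and are not taken from the paper.

```python
import numpy as np


def merge_pair(w_i, mu_i, var_i, w_j, mu_j, var_j):
    """Merge two diagonal Gaussians by matching their first two moments."""
    w = w_i + w_j
    mu = (w_i * mu_i + w_j * mu_j) / w
    second_moment = (w_i * (var_i + mu_i**2) + w_j * (var_j + mu_j**2)) / w
    return w, mu, second_moment - mu**2


def delta_bic(w_i, mu_i, var_i, w_j, mu_j, var_j, n_frames, penalty_weight=1.0):
    """Change in BIC (up to a factor of 2) caused by merging components i and j.

    Assumption: each component "owns" w * n_frames frames, so the expected
    log-likelihood loss follows from the log-variances alone.
    """
    n_i, n_j = w_i * n_frames, w_j * n_frames
    _, _, var = merge_pair(w_i, mu_i, var_i, w_j, mu_j, var_j)
    loss = 0.5 * ((n_i + n_j) * np.sum(np.log(var))
                  - n_i * np.sum(np.log(var_i))
                  - n_j * np.sum(np.log(var_j)))
    # A merge removes one mean vector, one variance vector and one weight.
    saved_params = 2 * mu_i.size + 1
    penalty = 0.5 * penalty_weight * saved_params * np.log(n_i + n_j)
    return loss - penalty


def compress_gmm(weights, means, variances, target, n_frames):
    """Greedily merge the pair with the lowest delta-BIC until `target` remain."""
    comps = list(zip(weights, means, variances))
    while len(comps) > target:
        _, i, j = min(
            ((delta_bic(*comps[a], *comps[b], n_frames), a, b)
             for a in range(len(comps)) for b in range(a + 1, len(comps))),
            key=lambda item: item[0],
        )
        merged = merge_pair(*comps[i], *comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    w, mu, var = zip(*comps)
    return np.array(w), np.array(mu), np.array(var)


# Toy usage: four 2-D components forming two tight clusters, reduced to two.
weights = np.full(4, 0.25)
means = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
variances = np.ones((4, 2))
new_w, new_mu, new_var = compress_gmm(weights, means, variances,
                                      target=2, n_frames=1000)
```

In an HMM setting this reduction would be applied to the mixture of each state separately. The pairwise search is quadratic in the number of components, which is acceptable for the mixture sizes typical of a single state but is a simplification rather than a tuned implementation.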

