The Application of Fuzzy Language Model in Lip-reading (cited by: 1)
Abstract: To address the data sparsity and the harshness of maximum likelihood estimation that traditional statistical language models face, this paper proposes an n-gram model based on fuzzy representation and applies it to a lip-reading system. Combined with the Hidden Markov Model (HMM), it yields a new lip-movement recognition model, HFM (HMM and Fuzzy Language Model). A small corpus was built with the online corpus system (语料库在线) developed by the Computational Linguistics Research Laboratory of the Institute of Applied Linguistics, Ministry of Education, and sentence recognition experiments were carried out on it. The results show that HFM, which needs no smoothing, improves the syllable recognition rate by up to 6.5% and the sentence recognition rate by up to 22.7%. In addition, when a language model is used to parse the text stream instead of blind text matching, the parsing accuracy of the single visual stream reaches 68.7%.
Source: Journal of Signal Processing (《信号处理》), CSCD, Peking University Core Journal, 2015, No. 10, pp. 1301-1306 (6 pages)
Funding: Supported by the Natural Science Foundation of Jiangsu Province (bk2012511)
Keywords: lip-reading; fuzzy language model; hidden Markov model; corpus
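
The data-sparsity problem that the abstract attributes to maximum likelihood n-gram estimation can be illustrated with a minimal sketch. This is not the paper's fuzzy model, whose details the record does not give; it is a plain bigram estimator trained on a hypothetical three-sentence pinyin corpus, and the function names `train_bigram_mle` and `sentence_prob` are made up for illustration. It shows how one unseen word pair drives a whole sentence's probability to zero, which is the situation that normally calls for smoothing.

```python
from collections import Counter


def train_bigram_mle(sentences):
    """Return p(w2 | w1) estimated by maximum likelihood from raw counts."""
    history = Counter()  # how often each word appears as a history
    bigram = Counter()   # how often each (w1, w2) pair appears
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        for w1, w2 in zip(tokens, tokens[1:]):
            history[w1] += 1
            bigram[(w1, w2)] += 1

    def prob(w2, w1):
        if history[w1] == 0:
            return 0.0  # history never observed
        return bigram[(w1, w2)] / history[w1]

    return prob


def sentence_prob(prob, sentence):
    """Multiply bigram probabilities along the sentence."""
    tokens = ["<s>"] + sentence.split()
    p = 1.0
    for w1, w2 in zip(tokens, tokens[1:]):
        p *= prob(w2, w1)
    return p


# Hypothetical toy corpus (assumption, not the paper's data).
corpus = ["ni hao", "ni hao ma", "wo hen hao"]
p = train_bigram_mle(corpus)

seen = sentence_prob(p, "ni hao")    # every bigram occurs in the corpus
unseen = sentence_prob(p, "wo hao")  # ("wo", "hao") never occurs
print(seen, unseen)
```

Every word of "wo hao" appears in the toy corpus, yet the sentence scores exactly zero because the pair ("wo", "hao") was never observed. With a small corpus like the one described in the abstract, such gaps are common.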