The HLM Model in Lipreading and Its Text Flow Analysis (cited by: 1)

Lipreading HLM and Text Flow Analysis
Abstract Because the mapping from lip-movement sequences to language sequences is one-to-many, an HMM alone is far from sufficient for automatic lipreading. Building on the HMM and incorporating prior linguistic knowledge, this paper proposes a new lip-movement recognition model, HLM (HMM and Bigram Language Model). HLM departs from the traditional framework, which recognizes solely from the acoustic posterior probability computed by the HMM. Instead, it couples the HMM tightly with linguistic background knowledge, gathers statistics on that knowledge through a language model, and fuses the acoustic posterior probability with the linguistic prior probability when making decisions in the recognition stage. Experimental results show that HLM raises syllable recognition accuracy by 7.3% and sentence recognition accuracy by 19.5%. In addition, the language model is used to parse the text flow in place of blind text matching, and parsing precision reaches 70.5% with the visual stream alone.
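The fusion step described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact decision rule, so the following assumes a common formulation: each segment's HMM score (here a log-likelihood per candidate syllable) is added to a weighted bigram log-prior, and the best sequence is found with a Viterbi-style search. The function name `hlm_decode`, the `lm_weight` parameter, the back-off floor, and all probabilities in the toy data are hypothetical and chosen only for illustration.

```python
import math

def hlm_decode(obs_loglik, bigram_logp, start_logp, lm_weight=1.0):
    """Return the syllable sequence maximizing HMM score + weighted bigram prior.

    obs_loglik:  list of dicts, one per segment: {syllable: log P(segment | syllable HMM)}
    bigram_logp: dict {(prev, cur): log P(cur | prev)}
    start_logp:  dict {syllable: log P(syllable starts the sentence)}
    lm_weight:   assumed scaling factor for the language-model term
    """
    floor = math.log(1e-6)  # assumed score for events missing from the tables

    # best[w] = (score of the best path ending in w, that path)
    best = {
        w: (ll + lm_weight * start_logp.get(w, floor), [w])
        for w, ll in obs_loglik[0].items()
    }
    for segment in obs_loglik[1:]:
        new_best = {}
        for w, ll in segment.items():
            # Choose the predecessor that gives the highest fused score so far.
            prev_score, prev_path = max(
                (
                    (score + lm_weight * bigram_logp.get((p, w), floor), path)
                    for p, (score, path) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[w] = (prev_score + ll, prev_path + [w])
        best = new_best

    score, path = max(best.values(), key=lambda item: item[0])
    return path, score

# Toy data (made up): "ba" and "ma" look alike on the lips, so the
# HMM scores are close and slightly favor "ba" in both segments.
obs = [
    {"ba": math.log(0.55), "ma": math.log(0.45)},
    {"ba": math.log(0.55), "ma": math.log(0.45)},
]
start = {"ba": math.log(0.5), "ma": math.log(0.5)}
bigram = {
    ("ma", "ma"): math.log(0.6),
    ("ma", "ba"): math.log(0.1),
    ("ba", "ma"): math.log(0.1),
    ("ba", "ba"): math.log(0.2),
}

hmm_only_path, _ = hlm_decode(obs, bigram, start, lm_weight=0.0)
fused_path, fused_score = hlm_decode(obs, bigram, start, lm_weight=1.0)
print("HMM only:", hmm_only_path)
print("HMM + bigram:", fused_path)
```

With the language-model weight set to zero the search follows the HMM scores alone and returns `ba ba`; with the bigram prior included, the toy statistics tip the decision to `ma ma`. This mirrors, on a very small scale, how a linguistic prior can disambiguate visually similar units.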
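Source Computer Science (《计算机科学》), CSCD, Peking University Core Journals, 2008, No. 12, pp. 171-174 (4 pages)
Funding Natural Science Foundation of Heilongjiang Province (E2005-29); Harbin Institute of Technology "New Century Talent Support Program" (NCET-05-0334)
Keywords lipreading, recognition model, HLM, HMM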