Abstract
Because the mapping from lip-movement sequences to language sequences is one-to-many, an HMM alone is far from sufficient for automatic lip-reading recognition. Building on the HMM and incorporating prior linguistic knowledge, this paper proposes a new lip-movement recognition model, HLM (HMM and Bigram Language Model). HLM departs from the traditional framework, which performs recognition using only the acoustic posterior probability computed by the HMM. It couples the HMM tightly with linguistic background knowledge, gathers statistics on that knowledge through a language model, and fuses the acoustic posterior probability with the linguistic prior probability when making decisions at the recognition stage. Experimental results show that HLM raises syllable recognition accuracy by 7.3% and sentence recognition accuracy by 19.5%. In addition, the text stream is parsed with the language model rather than by blind text matching, and parsing precision on the visual stream alone reaches 70.5%.
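The abstract does not give the exact fusion rule, so the following is only a minimal sketch of the general idea: per-segment HMM scores are combined with bigram prior probabilities in the log domain, and the best syllable sequence is found with a Viterbi-style search. The function name `decode`, the `lm_weight` and `unseen_logp` parameters, and the toy syllables and probabilities are all illustrative assumptions, not the authors' implementation or data.

```python
import math

def decode(hmm_scores, bigram_logp, lm_weight=1.0, unseen_logp=math.log(1e-6)):
    """Pick the syllable sequence maximising HMM score + weighted bigram prior.

    hmm_scores  : non-empty list with one dict per lip-movement segment,
                  mapping candidate syllable -> HMM log score (assumed given).
    bigram_logp : dict mapping (previous, current) -> log P(current | previous);
                  "<s>" marks the sentence start.
    lm_weight   : hypothetical knob; 0.0 reduces to HMM-only recognition.
    unseen_logp : hypothetical floor for bigrams missing from the table.
    """
    # best[syl] = (score of the best path ending in syl, that path)
    best = {}
    for syl, score in hmm_scores[0].items():
        prior = bigram_logp.get(("<s>", syl), unseen_logp)
        best[syl] = (score + lm_weight * prior, [syl])

    for segment in hmm_scores[1:]:
        new_best = {}
        for syl, score in segment.items():
            # Choose the predecessor that gives the highest fused score.
            prev_score, prev_path = max(
                (
                    (s + lm_weight * bigram_logp.get((prev, syl), unseen_logp), p)
                    for prev, (s, p) in best.items()
                ),
                key=lambda item: item[0],
            )
            new_best[syl] = (prev_score + score, prev_path + [syl])
        best = new_best

    total, path = max(best.values(), key=lambda item: item[0])
    return path, total

# Toy data (invented): "ba" and "ma" look alike on the lips, so the HMM
# scores are close and the bigram prior decides between them.
toy_hmm = [
    {"ba": math.log(0.55), "ma": math.log(0.45)},
    {"ba": math.log(0.55), "ma": math.log(0.45)},
]
toy_bigram = {
    ("<s>", "ma"): math.log(0.6), ("<s>", "ba"): math.log(0.4),
    ("ma", "ma"): math.log(0.8), ("ma", "ba"): math.log(0.2),
    ("ba", "ma"): math.log(0.7), ("ba", "ba"): math.log(0.3),
}

hmm_only_path, _ = decode(toy_hmm, toy_bigram, lm_weight=0.0)
fused_path, _ = decode(toy_hmm, toy_bigram)
print("HMM only:", hmm_only_path)
print("HMM + bigram:", fused_path)
```

In this toy case the HMM scores alone slightly favour one reading, while the bigram prior tips the decision toward the sequence that is more plausible linguistically, which is the kind of ambiguity the one-to-many mapping creates.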
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2008, No. 12, pp. 171-174 (4 pages)
Funding
Natural Science Foundation of Heilongjiang Province (E2005-29)
Harbin Institute of Technology, "New Century Talent Support Program" (NCET-05-0334)