Abstract
To improve the accuracy of lip-reading recognition, this study proposes an improved topic-related statistical language model. Starting from keywords, topics are delimited by subject words, and improved methods for scene training corpus design and parameter estimation are adopted: the scene training corpus of each topic is represented as a fuzzy subset of the whole scene training corpus, and the parameters are likewise estimated from the fuzzy training sets of the individual topics. The improved method eases reasonably well the data sparsity that insufficient training corpora introduce into conventional language models, and it gives a quantitative description of the strength of the association between the scene training corpus and the topics.
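As a rough illustration of estimating parameters from a fuzzy training set, the following minimal sketch weights every bigram count by the degree to which its sentence belongs to a topic. The abstract does not give the membership function or the estimation formula, so everything here is an assumption: the names `membership` and `fuzzy_bigram_model` are hypothetical, the membership degree is taken to be the fraction of tokens that are subject words, and the estimate is a plain weighted relative frequency without smoothing.

```python
from collections import defaultdict


def membership(sentence, topic_words):
    """Assumed membership degree in [0, 1]: share of tokens that are subject words."""
    tokens = sentence.split()
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in topic_words)
    return hits / len(tokens)


def fuzzy_bigram_model(corpus, topic_words):
    """Estimate P(cur | prev) for one topic from membership-weighted bigram counts."""
    bigram_weight = defaultdict(float)
    context_weight = defaultdict(float)
    for sentence in corpus:
        mu = membership(sentence, topic_words)
        if mu == 0.0:
            continue  # sentence lies outside this topic's fuzzy subset
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            bigram_weight[(prev, cur)] += mu  # fractional count instead of 1
            context_weight[prev] += mu
    return {
        pair: weight / context_weight[pair[0]]
        for pair, weight in bigram_weight.items()
    }


if __name__ == "__main__":
    corpus = ["book a flight", "book a table", "the weather is nice"]
    model = fuzzy_bigram_model(corpus, {"book", "flight"})
    print(model[("a", "flight")], model[("a", "table")])
```

Because each sentence contributes a fractional count rather than being assigned to a single topic, one corpus can serve several topics at once, which is one plausible reading of how a fuzzy training set reduces data sparsity.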
Source
《微计算机信息》
2012, No. 10, pp. 115-117, 195 (4 pages in total)
Control & Automation
Keywords
lip-reading recognition
statistical language model
topic-related
fuzzy training