Abstract
This paper proposes a text information extraction model based on an improved Hidden Markov Model (HMM). It introduces a new assumption about observation emission and uses the absolute smoothing algorithm to smooth the model parameters. The Viterbi algorithm decodes the observation sequence in both forward and reverse order, and the two decoding results are compared and disambiguated with an N-Gram model to obtain a more accurate state sequence. Experimental results indicate that the model improves the precision of information extraction.
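The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the new emission assumption, the discount value, or the N-Gram comparison rule, so everything below is an assumption made for illustration. Here absolute smoothing is taken to be absolute discounting with the freed mass shared equally among unseen items, the reverse decoder is an HMM trained on reversed sequences, and disambiguation simply keeps whichever full path scores higher under a bigram model over states. All function names (`absolute_smooth`, `train`, `viterbi`, `decode_both`, `disambiguate`) and the toy citation data are hypothetical.

```python
import math
from collections import Counter, defaultdict

def absolute_smooth(counts, vocab, d=0.5):
    """Absolute discounting (assumed variant): subtract d from every seen
    count and share the freed mass equally among unseen items in vocab."""
    total = sum(counts.values())
    if total == 0:
        return {v: 1.0 / len(vocab) for v in vocab}
    seen = [v for v in vocab if counts.get(v, 0) > 0]
    unseen = [v for v in vocab if counts.get(v, 0) == 0]
    if not unseen:
        return {v: counts[v] / total for v in vocab}
    freed = d * len(seen) / total
    probs = {v: (counts[v] - d) / total for v in seen}
    probs.update({v: freed / len(unseen) for v in unseen})
    return probs

def train(tagged, states, vocab, d=0.5):
    """Estimate smoothed initial, transition and emission distributions."""
    init, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for seq in tagged:
        init[seq[0][1]] += 1
        for (_, a), (_, b) in zip(seq, seq[1:]):
            trans[a][b] += 1
        for word, s in seq:
            emit[s][word] += 1
    pi = absolute_smooth(init, states, d)
    A = {s: absolute_smooth(trans[s], states, d) for s in states}
    B = {s: absolute_smooth(emit[s], vocab, d) for s in states}
    return pi, A, B

def viterbi(obs, states, pi, A, B):
    """Standard Viterbi decoding in log space; returns the best state path."""
    score = {s: math.log(pi[s]) + math.log(B[s][obs[0]]) for s in states}
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + math.log(A[r][s]))
            score[s] = prev[best] + math.log(A[best][s]) + math.log(B[s][o])
            ptr[s] = best
        back.append(ptr)
    path = [max(states, key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

def decode_both(obs, states, fwd, bwd):
    """Decode obs with the forward model, and reversed obs with the model
    trained on reversed sequences; both paths are returned in reading order."""
    forward = viterbi(obs, states, *fwd)
    backward = viterbi(obs[::-1], states, *bwd)[::-1]
    return forward, backward

def bigram_logprob(path, pi, A):
    """Log-probability of a state path under a bigram model over states."""
    return math.log(pi[path[0]]) + sum(
        math.log(A[a][b]) for a, b in zip(path, path[1:]))

def disambiguate(forward, backward, pi, A):
    """Assumed rule: if the two paths differ, keep the higher-scoring one."""
    if forward == backward:
        return forward
    return max((forward, backward), key=lambda p: bigram_logprob(p, pi, A))

# Toy citation data (hypothetical): each token is tagged with a field label.
states = ["author", "title", "year"]
tagged = [
    [("smith", "author"), ("hmm", "title"), ("extraction", "title"), ("2011", "year")],
    [("li", "author"), ("smith", "author"), ("text", "title"), ("2010", "year")],
]
vocab = sorted({w for seq in tagged for w, _ in seq})
fwd = train(tagged, states, vocab)
bwd = train([seq[::-1] for seq in tagged], states, vocab)
obs = ["li", "text", "extraction", "2011"]
forward, backward = decode_both(obs, states, fwd, bwd)
result = disambiguate(forward, backward, fwd[0], fwd[1])
print(forward, backward, result)
```

Smoothing matters here because any zero transition or emission probability would make `math.log` fail and would rule out a path outright; discounting keeps every probability strictly positive. The sketch assumes every observed token is in `vocab`; handling out-of-vocabulary tokens would need an extra step that the abstract does not describe.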
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core Journal (北大核心)
2011, No. 20, pp. 178-179, 182 (3 pages)
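Funding
Natural Science Basic Research Fund for Universities of Jiangsu Province (08KJD120004)
National Education Science Planning Special Fund for Moral Education (GEA090005)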
Keywords
Hidden Markov Model (HMM)
absolute smoothing
observation
information extraction
citation information