Text Information Extraction Based on the Second-Order Hidden Markov Model (cited by: 25)
Abstract: The hidden Markov model (HMM) is one of the important methods for text information extraction. A first-order HMM assumes that the state transition probability and the observation output probability depend only on the model's current state, which reduces extraction precision to some extent. A second-order HMM also takes into account how these probabilities depend on the model's previous states, and it is better at recognizing erroneous information. This paper proposes a text information extraction algorithm based on the second-order HMM and analyzes the effectiveness of the second-order HMM for text information extraction. Simulation experiments show that the new algorithm achieves higher extraction precision than the algorithm based on the first-order HMM.
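Source: Acta Electronica Sinica (《电子学报》; indexed in EI, CAS, CSCD, Peking University Core), 2007, No. 11, pp. 2226-2231 (6 pages)
Funding: National High Technology Research and Development Program of China (863 Program) (No. 2006AA01Z227); Key Natural Science Foundation of Hunan Province (No. 06JJ20049)
Keywords: text information extraction; first-order hidden Markov model; second-order hidden Markov model; precision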
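
The second-order assumption described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only shows how a transition probability conditioned on the two preceding states, P(s_t | s_{t-2}, s_{t-1}), could be estimated by relative frequency from labelled state sequences. The function name `second_order_transitions`, the state labels, and the training sequences are all hypothetical.

```python
from collections import Counter


def second_order_transitions(sequences):
    """Estimate P(s_t | s_{t-2}, s_{t-1}) by relative frequency (no smoothing)."""
    trigram = Counter()   # counts of (s_{t-2}, s_{t-1}, s_t)
    context = Counter()   # counts of (s_{t-2}, s_{t-1}) when followed by a state
    for states in sequences:
        for a, b, c in zip(states, states[1:], states[2:]):
            trigram[(a, b, c)] += 1
            context[(a, b)] += 1

    probs = {}
    for (a, b, c), n in trigram.items():
        probs.setdefault((a, b), {})[c] = n / context[(a, b)]
    return probs


# Hypothetical labelled state sequences (e.g. fields of a paper header).
sequences = [
    ["title", "title", "author", "author", "affiliation"],
    ["title", "title", "title", "author", "affiliation"],
]
probs = second_order_transitions(sequences)

# The next state after "author" depends on what came before it:
print(probs[("title", "author")])
print(probs[("author", "author")])
```

A first-order model would pool both contexts into a single distribution for "author"; conditioning on the preceding pair keeps them separate, at the cost of more parameters to estimate, so a practical system would normally add smoothing.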
