
Text information extraction based on the conditional random field (CRF) model (cited by: 8)

Using conditional random fields model for text information extraction
Abstract: To extract information from text, four statistical modeling prototypes were analyzed and compared, and the conditional random field (CRF) was chosen to build the extraction model, yielding a method for text information extraction. The method analyzes and labels the text, determines the feature set, and estimates the CRF model parameters with L-BFGS, a limited-memory quasi-Newton iterative algorithm. The trained model is then used to extract the common header fields from a data set of research papers. Experimental results show that the CRF model achieves an extraction precision above 90%, far higher than that of the HMM model.
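The pipeline described in the abstract (labeled tokens, a feature set, a trained CRF used to tag header fields) can be illustrated with a minimal linear-chain CRF sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the label set `LABELS`, the feature templates in `features`, the hand-picked `WEIGHTS`, and the sample tokens. Training is omitted; in the paper the weights are estimated with L-BFGS, which would maximize the `log_likelihood` function below over labeled sequences.

```python
import math
from itertools import product

# Hypothetical header fields; the paper's actual label set is not given here.
LABELS = ["title", "author", "affiliation"]

def features(tokens, i, prev, cur):
    """Names of the binary features active at position i (illustrative templates)."""
    tok = tokens[i]
    feats = ["label=" + cur, "trans=" + prev + "->" + cur]
    if tok[0].isupper():
        feats.append("cap&label=" + cur)
    if tok.lower() in {"university", "institute"}:
        feats.append("org&label=" + cur)
    if i == 0:
        feats.append("first&label=" + cur)
    return feats

def score(weights, tokens, i, prev, cur):
    """Local score: sum of the weights of the active features."""
    return sum(weights.get(f, 0.0) for f in features(tokens, i, prev, cur))

def sequence_score(weights, tokens, labels):
    """Unnormalized log-score of one complete label sequence."""
    total, prev = 0.0, "START"
    for i, cur in enumerate(labels):
        total += score(weights, tokens, i, prev, cur)
        prev = cur
    return total

def log_sum_exp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_partition(weights, tokens):
    """Forward algorithm: log Z(x), summing over all label sequences."""
    alpha = {y: score(weights, tokens, 0, "START", y) for y in LABELS}
    for i in range(1, len(tokens)):
        alpha = {
            y: log_sum_exp([alpha[p] + score(weights, tokens, i, p, y) for p in LABELS])
            for y in LABELS
        }
    return log_sum_exp(list(alpha.values()))

def log_likelihood(weights, tokens, labels):
    """Conditional log-likelihood log p(y | x); the quantity training maximizes."""
    return sequence_score(weights, tokens, labels) - log_partition(weights, tokens)

def viterbi(weights, tokens):
    """Most probable label sequence under the current weights."""
    best = {y: (score(weights, tokens, 0, "START", y), [y]) for y in LABELS}
    for i in range(1, len(tokens)):
        new_best = {}
        for y in LABELS:
            s, path = max(
                ((best[p][0] + score(weights, tokens, i, p, y), best[p][1]) for p in LABELS),
                key=lambda t: t[0],
            )
            new_best[y] = (s, path + [y])
        best = new_best
    return max(best.values(), key=lambda t: t[0])[1]

def brute_force_log_partition(weights, tokens):
    """Sanity check for short inputs: enumerate every label sequence."""
    return log_sum_exp(
        [sequence_score(weights, tokens, list(seq)) for seq in product(LABELS, repeat=len(tokens))]
    )

# Hand-picked weights for illustration only; real weights come from training.
WEIGHTS = {
    "first&label=title": 2.0,
    "trans=title->title": 1.0,
    "trans=title->author": 0.5,
    "trans=author->author": 1.0,
    "trans=author->affiliation": 0.5,
    "org&label=affiliation": 3.0,
}

if __name__ == "__main__":
    sample = ["Text", "Extraction", "Li", "Wei", "University"]
    print(viterbi(WEIGHTS, sample))
    print(log_likelihood(WEIGHTS, sample, viterbi(WEIGHTS, sample)))
```

Because the normalizer is computed with the forward algorithm, the cost grows linearly with sequence length rather than exponentially; `brute_force_log_partition` is included only to check the forward pass on very short inputs.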
Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 23, pp. 6094-6097 (4 pages)
Keywords: conditional random fields; text information extraction; parameter estimation; L-BFGS iterative method; feature set

References (8)

  • [1] Freitag D, McCallum A. Information extraction with HMM structures learned by stochastic optimization [C]. Proceedings of the Eighteenth Conference on Artificial Intelligence. Edmonton: AAAI Press, 2002: 584-589.
  • [2] Ray S, Craven M. Representing sentence structure in hidden Markov models for information extraction [C]. Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence. Washington: Morgan Kaufmann, 2001: 1273-1279.
  • [3] Scheffer T, Decomain C, Wrobel S. Active hidden Markov models for information extraction [C]. Proceedings of the Fourth International Symposium on Intelligent Data Analysis. Lisbon: Springer, 2001: 301-109.
  • [4] Freitag D, McCallum A, Pereira F. Maximum entropy Markov models for information extraction and segmentation [C]. Proceedings of the Seventeenth International Conference on Machine Learning. San Francisco: Morgan Kaufmann, 2000: 591-598.
  • [5] Lafferty J, McCallum A, Pereira F. Conditional random fields: Probabilistic models for segmenting and labeling sequence data [C]. Proceedings of ICML, 2001: 282-289.
  • [6] Liu D, Nocedal J. On the limited memory BFGS method for large scale optimization [J]. Mathematical Programming, 1989, 45: 503-528.
  • [7] http://www.chasen.org/~taku/software/CRF++/ [EB/OL].
  • [8] McCallum A. Efficiently inducing features of conditional random fields [C]. Proceedings of Conference on Uncertainty in Artificial Intelligence, 2003.

Co-cited documents: 56

Citing documents: 8

Second-level citing documents: 23
