
Research on CRF-based dynamic relation extraction from web pages (cited by: 2)

CRF based dynamic relations extraction from web
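Abstract This paper proposes an algorithm based on conditional random fields (CRF) for extracting dynamic relations from web pages. It defines dynamic relations, builds a representation model for them, and expresses each dynamic relation as a six-dimensional structure. Unlike the rule-based or classification-based approaches of traditional relation extraction, the paper recasts dynamic relation recognition as a labeling problem and proposes a CRF-based method for labeling and extracting relations at the sentence level. The algorithm first passes a sentence through a semantic role labeling (SRL) system to identify its constituents, then uses the SRL output, together with the POS types of words and the named-entity types of phrases, as CRF training features to label the sentence constituents. Tests on a large number of real news pages show that the proposed algorithm is practical and effective.

New methods for extracting dynamic relations from web resources such as news pages are proposed. A relation is defined as dynamic if its instances change over time; an example is the employment relation between people and companies. The nature of dynamic relations requires the extraction methods to capture the temporal context of the relation. While most previous work on this topic has been domain-specific, a domain-independent, general approach is proposed using a conditional random fields (CRF) based technique. Experiments with news pages from the web show the practicality and precision of the proposed approach.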
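The pipeline described in the abstract (SRL output plus POS and named-entity types as per-token CRF features, followed by sequence labeling) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `Token` fields, the one-token context window, the BIO label names (`SUBJ`, `PRED`, `OBJ`, `TIME`), and the example sentence are hypothetical, and this record does not give the paper's actual feature templates, label set, or six-dimensional structure. The sketch only builds feature dictionaries of the kind a CRF toolkit typically consumes and groups a predicted label sequence into phrases; it does not train a model.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    word: str
    pos: str  # part-of-speech tag
    ne: str   # named-entity type, "O" if none
    srl: str  # semantic role assigned by an SRL system, "O" if none


def token_features(sent: List[Token], i: int) -> Dict[str, str]:
    """Feature dict for token i, using a one-token window (an assumed template)."""
    tok = sent[i]
    feats = {"word": tok.word.lower(), "pos": tok.pos, "ne": tok.ne, "srl": tok.srl}
    if i > 0:
        prev = sent[i - 1]
        feats.update({"-1:pos": prev.pos, "-1:ne": prev.ne, "-1:srl": prev.srl})
    else:
        feats["BOS"] = "1"  # beginning of sentence
    if i < len(sent) - 1:
        nxt = sent[i + 1]
        feats.update({"+1:pos": nxt.pos, "+1:ne": nxt.ne, "+1:srl": nxt.srl})
    else:
        feats["EOS"] = "1"  # end of sentence
    return feats


def sentence_features(sent: List[Token]) -> List[Dict[str, str]]:
    """One feature dict per token, in sentence order."""
    return [token_features(sent, i) for i in range(len(sent))]


def group_labels(words: List[str], labels: List[str]) -> Dict[str, List[str]]:
    """Group BIO-labeled words into phrases keyed by label type.

    An I- tag that does not continue the current span is treated like "O".
    """
    spans: Dict[str, List[str]] = {}
    current, buf = None, []

    def flush() -> None:
        if current is not None:
            spans.setdefault(current, []).append(" ".join(buf))

    for word, label in zip(words, labels):
        if label.startswith("B-"):
            flush()
            current, buf = label[2:], [word]
        elif label.startswith("I-") and current == label[2:]:
            buf.append(word)
        else:
            flush()
            current, buf = None, []
    flush()
    return spans


# Hypothetical sentence with hand-written POS / NE / SRL annotations.
sentence = [
    Token("Alice", "NNP", "PERSON", "A0"),
    Token("joined", "VBD", "O", "V"),
    Token("Acme", "NNP", "ORG", "A1"),
    Token("in", "IN", "O", "AM-TMP"),
    Token("2009", "CD", "DATE", "AM-TMP"),
]
features = sentence_features(sentence)

# Hypothetical label sequence, standing in for what a trained CRF might predict.
predicted = ["B-SUBJ", "B-PRED", "B-OBJ", "O", "B-TIME"]
relation_parts = group_labels([t.word for t in sentence], predicted)
```

In a real system, `features` for many annotated sentences and their gold label sequences would be passed to a CRF trainer, and `group_labels` would be applied to the model's predictions rather than to a hand-written sequence.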
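Source Journal of University of Science and Technology of China (JUSTC), indexed in CAS, CSCD, and the Peking University Core Journals list, 2010, No. 11, pp. 1197-1202 (6 pages)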
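Funding Supported by a State Key Laboratory open project (2009006), the National Natural Science Foundation of China (60776801, 70803001), and the open fund (XDXX1005) of the Beijing Key Laboratory of "Modern Information Science and Network Technology" and the Ministry of Railways Open Laboratory of "Railway Information Science and Engineering"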
Keywords conditional random fields; relation extraction; semantic role labeling
