Study about automatic entity relation extraction (实体关系的自动抽取研究)

Cited by: 10
Abstract: To address the problem of automatically acquiring entity relations, this paper combines the Maximum Entropy (ME) algorithm with the Bootstrapping algorithm. Using Bootstrapping together with the idea of scalar clustering, seed patterns and seed words are set up to obtain the feature words required by the ME algorithm; these feature words are semantically similar to the seed words. Building on the ME algorithm, nine features are systematically designed from the morphological, grammatical, and semantic aspects of language, so as to describe the entities in the text as comprehensively as possible. A system framework for the experiments is constructed, and automatic entity relation extraction is implemented. Experimental results show that the method can effectively solve the problem of automatically extracting entity relations.
Source: Journal of Harbin Engineering University (《哈尔滨工程大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2006, Issue B07, pp. 370-373 (4 pages).
Funding: Supported by a major project under the computer theme of the National 863 Program of China (2001AA114210).
Keywords: maximum entropy; Bootstrapping; feature selection; entity relation extraction; evaluation