
People Relation Extraction Method Based on Feature Selection (cited by: 7)
Abstract: In people relation extraction, the feature space is often very high-dimensional, which leads to sparse feature vectors and reduces extraction efficiency. To address this problem, people relations are first divided into six categories. Four feature selection algorithms from text classification (document frequency, information gain, mutual information, and the χ² statistic) are then introduced to reduce the dimensionality of the feature space. Finally, an SVM classifier is used to extract the entity relations between people. Experimental results show that the four feature selection algorithms not only preserve extraction performance but also effectively reduce the dimensionality of the vector space and greatly improve extraction efficiency. Among them, the χ² statistic performs best, followed by information gain.
Source: Science Technology and Engineering (《科学技术与工程》), Peking University Core Journal, 2015, No. 3, pp. 254-259 (6 pages)
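Funding: Supported by the National Natural Science Foundation of China (61363072), the Ministry of Education Humanities and Social Sciences Fund (11YJC740157), and the Natural Science Foundation of Jiangxi Province (20114BAB201027)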
Keywords: relation extraction; SVM; feature selection; multi-classification
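
The χ² feature selection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy samples, the feature strings, the two class labels, the function names, and the choice to score each feature by its maximum χ² over classes are all assumptions made for illustration. The score uses the standard 2×2 contingency form common in text classification, and the SVM classification step is left out.

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square score of one (feature, class) pair from a 2x2 table.

    n11: samples in the class that contain the feature
    n10: samples outside the class that contain the feature
    n01: samples in the class that lack the feature
    n00: samples outside the class that lack the feature
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(samples, labels, k):
    """Return the k best features and all scores.

    Assumption: a feature's score is its maximum chi-square over classes.
    """
    n = len(samples)
    class_count = Counter(labels)
    feat_count = Counter()
    feat_class_count = Counter()
    for feats, label in zip(samples, labels):
        for f in set(feats):  # count each feature once per sample
            feat_count[f] += 1
            feat_class_count[(f, label)] += 1

    scores = {}
    for f, df in feat_count.items():
        best = 0.0
        for c, nc in class_count.items():
            n11 = feat_class_count[(f, c)]
            n10 = df - n11
            n01 = nc - n11
            n00 = n - n11 - n10 - n01
            best = max(best, chi_square(n11, n10, n01, n00))
        scores[f] = best

    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda f: (-scores[f], f))
    return ranked[:k], scores


# Hypothetical toy data: each sample is the feature list of one entity pair.
samples = [
    ["wife", "married"],
    ["husband", "married"],
    ["colleague", "company"],
    ["company", "boss"],
]
labels = ["family", "family", "work", "work"]

selected, scores = select_features(samples, labels, k=2)
print(selected)
```

In this toy set, features that occur in every sample of one class and in none of the other score highest, so a reduced feature vector built from `selected` keeps the most class-discriminative dimensions before any classifier is trained.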
