Abstract
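To address the "semantic variation" problem that traditional semi-supervised relation extraction algorithms are prone to, a new semi-supervised relation extraction algorithm based on tree kernel functions is proposed. The algorithm relies on two strategies, a tree kernel function and constrained expansion of the seed set, to reduce the loss of extraction accuracy caused by semantic variation and to raise the rate of correctly identified relations. Experiments on the benchmark data set Pop Bank show that the proposed semi-supervised learning method, which expands the seed set under a constraint mechanism, outperforms two commonly used relation extraction methods on four evaluation metrics (Precision, Recall, F-measure, Accuracy), indicating that the algorithm extracts relations better than the methods it was compared with.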
It is difficult for traditional semi-supervised relation extraction methods to solve the "semantic variation" problem. A new semi-supervised relation extraction algorithm based on ensemble learning, named L-EC-RE, was proposed. It uses two strategies: a tree kernel and constrained expansion of the seed set. An experimental study on the PopBank benchmark data set showed that L-EC-RE performed better than two commonly used relation extraction algorithms on four evaluation criteria: Precision, Recall, F-measure and Accuracy.
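The four evaluation criteria named in the abstract can be illustrated with a minimal sketch. It assumes a binary setting (a candidate entity pair either holds the target relation or does not) and uses the balanced F1 form of the F-measure. The abstract does not say how the authors computed or averaged these scores, so the function name `evaluate` and the toy labels below are hypothetical.

```python
def evaluate(gold, predicted):
    """Return precision, recall, F-measure and accuracy for binary labels.

    Assumption: 1 means "relation holds", 0 means "no relation".
    This is a generic illustration, not the evaluation code of the paper.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # true positives
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # false positives
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # false negatives
    tn = sum(1 for g, p in pairs if g == 0 and p == 0)  # true negatives

    # Guard against empty denominators by falling back to 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the paper.
    gold = [1, 1, 0, 0, 1]
    predicted = [1, 0, 0, 1, 1]
    print(evaluate(gold, predicted))
```

Precision and recall ignore true negatives while accuracy counts them, which is one reason a relation extraction study might report all four figures rather than accuracy alone.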
Source
Journal of Shandong University (Engineering Science)
CAS
Peking University Core Journal
2015, Issue 2, pp. 22-26, 32 (6 pages)
Funding
Training Program for Outstanding Young Teachers in Guangdong Higher Education Institutions (Yq2013108)
Keywords
relationship extraction
tree kernel
support vector machine
semi-supervised method
semantic variation