Abstract
Semi-supervised learning based on label propagation can improve relation extraction when only a small amount of labeled data is available, but selecting training samples at random degrades extraction performance. To extract highly reliable personal relations from massive amounts of web information, this paper combines the label propagation algorithm with active learning for personal relation extraction. During training data acquisition, the samples with the greatest uncertainty are actively selected for labeling. Experimental results on personal relations show that introducing active learning raises the average F1 score by 2.3% compared with the label propagation algorithm.
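The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the graph construction, the uncertainty measure, or the feature set, so everything below is an assumption made for illustration: an RBF affinity graph over feature vectors, clamped iterative label propagation, and prediction entropy as the uncertainty score. The function names `label_propagation` and `most_uncertain` and the parameters `sigma` and `n_iter` are hypothetical and are not taken from the paper.

```python
import numpy as np


def label_propagation(X, labels, n_classes, sigma=1.0, n_iter=200):
    """Propagate labels over an RBF affinity graph (illustrative assumption).

    labels holds a class index for labeled samples and -1 for unlabeled ones.
    Returns an (n_samples, n_classes) array of class distributions.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances, then RBF edge weights.
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))

    # Row-normalize so each row is a transition distribution.
    T = W / W.sum(axis=1, keepdims=True)

    labeled = labels >= 0
    clamp = np.eye(n_classes)[labels[labeled]]  # one-hot rows for labeled samples

    # Start unlabeled samples from a uniform distribution.
    Y = np.full((len(X), n_classes), 1.0 / n_classes)
    Y[labeled] = clamp

    for _ in range(n_iter):
        Y = T @ Y            # spread label mass along graph edges
        Y[labeled] = clamp   # reset labeled samples to their known labels
    return Y


def most_uncertain(Y, labels):
    """Return the index of the unlabeled sample with the highest entropy."""
    labels = np.asarray(labels)
    entropy = -(Y * np.log(Y + 1e-12)).sum(axis=1)
    entropy[labels >= 0] = -np.inf  # never re-select an already labeled sample
    return int(np.argmax(entropy))
```

In an active learning loop under these assumptions, the index returned by `most_uncertain` would be sent to an annotator, its label written back into `labels`, and propagation run again. Random selection, which the abstract reports as the weaker baseline, would replace `most_uncertain` with a uniform draw over the unlabeled indices.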
Source
《计算机工程》
CAS
CSCD
PKU Core Journals
2017, No. 2, pp. 234-240 (7 pages)
Computer Engineering
Funding
National Natural Science Foundation of China (61332004)
Keywords
personal social relation
feature extraction
label propagation
active learning
relation extraction
semi-supervised learning