Abstract
Distantly supervised relation extraction builds datasets automatically, without manual annotation, and the baseline model SENT (sentence-level distant relation extraction via negative training) brings negative training to this setting. However, SENT extracts features with a bidirectional long short-term memory network (BiLSTM), which mainly learns long-distance dependencies and captures local contextual features inadequately. In addition, during negative training the baseline does not focus on features related to complementary labels, so it learns too little complementary-label information, which weakens its ability to identify noisy data. To address these problems, a convolutional neural network is introduced: convolution kernels are applied over the input relation sequence to capture local information in each input relation instance, improving the model's ability to learn local features. To address the lack of attention to complementary-label features, a reverse attention mechanism is introduced: by adjusting the weights of hidden units related to complementary labels, it lets the model selectively attend to complementary-label information, which improves the identification of noisy data based on complementary labels and in turn improves relation extraction performance. The method is evaluated on the NYT10 dataset; the results show that it raises the F1 score on the NYT10 relation extraction task by 4.84% over the baseline model, effectively improving distantly supervised relation extraction.
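The negative training idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not state the loss, so this assumes the commonly used negative-learning form, in which the model is told "this instance does not have label k" and the probability of that complementary label is pushed down via -log(1 - p_k). The function names `sample_complementary_labels` and `negative_training_loss` are hypothetical and are not taken from the paper or from SENT's code.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def sample_complementary_labels(labels, num_classes, rng):
    """Pick, for each instance, a random label different from its given label.

    Adding an offset in [1, num_classes - 1] modulo num_classes can never
    land back on the original label.
    """
    offsets = rng.integers(1, num_classes, size=labels.shape)
    return (labels + offsets) % num_classes


def negative_training_loss(logits, complementary_labels, eps=1e-12):
    """Assumed negative-training loss: mean of -log(1 - p[complementary label])."""
    probs = softmax(logits)
    rows = np.arange(len(complementary_labels))
    p_comp = probs[rows, complementary_labels]
    return float(-np.mean(np.log(1.0 - p_comp + eps)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    labels = np.array([0, 1, 2, 3])        # possibly noisy distant labels
    comp = sample_complementary_labels(labels, num_classes=4, rng=rng)
    logits = rng.normal(size=(4, 4))       # stand-in for model outputs
    print(comp, negative_training_loss(logits, comp))
```

Under this assumed loss, an instance whose distant label is noisy tends to keep a low predicted probability for that label during training, which is what makes low-confidence instances usable as a noise signal.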
Authors
赵明
刘胜全
岳柳
ZHAO Ming; LIU Shengquan; YUE Liu (College of Software, Xinjiang University, Urumqi 830000, China; College of Computer Science and Technology, Xinjiang University, Urumqi 830000, China)
Source
《现代电子技术》
Peking University Core Journal (北大核心)
2024, Issue 16, pp. 51-57 (7 pages)
Modern Electronics Technique
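Funding
Major Science and Technology Project of Xinjiang Uygur Autonomous Region (2022A02012-1)
National Natural Science Foundation of China (61966034)
National Key Research and Development Program of China (2022ZD0115801)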
Keywords
distant supervision
relation extraction
baseline model SENT
negative training
attention mechanism
complementary label
deep learning