Abstract
Relation classification is a fundamental task in natural language processing that aims to identify the semantic relation between a pair of entities. Existing methods rely mainly on sentence-level features and overlook information about the entities themselves, even though several kinds of entity information, such as entity position, entity type, and entity dependency information, help to identify the relation between entities. To make full use of this information, a relation classification model that integrates multiple kinds of entity information, BERT-MEI, is proposed. First, entity types are marked and the shortest dependency path between the entities is extracted. The input is then encoded with the pretrained language representation model BERT (Bidirectional Encoder Representations from Transformers), and the encoded sentence vector, entity vectors, and entity dependency vector are combined into the final entity relation representation. Experimental results on the KBP37 and TACRED datasets show that the F1 score of BERT-MEI is 1 to 17 percentage points higher than that of the baseline models, which verifies that using multiple kinds of entity information improves relation classification.
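The two preprocessing steps named in the abstract, marking entity types and extracting the shortest dependency path, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the marker format (`[S:TYPE]`, `[O:TYPE]`), the function names, the example sentence, and the hand-written dependency heads are all assumptions made for the example, and the BERT encoding and vector combination steps are omitted.

```python
from collections import deque


def mark_entities(tokens, subj, obj):
    """Wrap each entity span in type markers, e.g. [S:PERSON] ... [/S:PERSON].

    subj and obj are (start, end_exclusive, entity_type) triples.
    The marker format is hypothetical, chosen only for illustration.
    """
    out = []
    for i, tok in enumerate(tokens):
        for tag, (start, end, etype) in (("S", subj), ("O", obj)):
            if i == start:
                out.append(f"[{tag}:{etype}]")
        out.append(tok)
        for tag, (start, end, etype) in (("S", subj), ("O", obj)):
            if i == end - 1:
                out.append(f"[/{tag}:{etype}]")
    return out


def shortest_dependency_path(heads, src, dst):
    """Breadth-first search over the dependency tree as an undirected graph.

    heads[i] is the index of token i's head, or -1 for the root.
    Returns the token indices on the path from src to dst.
    """
    adj = {i: set() for i in range(len(heads))}
    for child, head in enumerate(heads):
        if head >= 0:
            adj[child].add(head)
            adj[head].add(child)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in sorted(adj[node]):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    if dst not in prev:
        return []
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


# Hypothetical example sentence with a hand-written dependency parse.
tokens = ["Alice", ",", "who", "lives", "in", "Paris", ",", "founded", "Acme"]
heads = [7, 0, 3, 0, 3, 4, 0, -1, 7]
subj = (0, 1, "PERSON")
obj = (8, 9, "ORGANIZATION")

marked = mark_entities(tokens, subj, obj)
sdp = [tokens[i] for i in shortest_dependency_path(heads, subj[0], obj[0])]
print(" ".join(marked))
print(" ".join(sdp))
```

In this toy parse the path between the two entities skips the relative clause and keeps only the verb that links them, which is the kind of entity dependency information the abstract describes. In practice the heads would come from a dependency parser rather than being written by hand.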
Authors
胡红卫
刘晓楠
尹美娟
李玲玲
刘粉林
HU Hongwei; LIU Xiaonan; YIN Meijuan; LI Lingling; LIU Fenlin (Henan Province Key Laboratory of Cyberspace Situation Awareness, Zhengzhou 450001, China; State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China; Zhengzhou University of Aeronautics, Zhengzhou 450015, China)
Source
《信息工程大学学报》
2022, No. 1, pp. 51-57 (7 pages)
Journal of Information Engineering University
Funding
Zhongyuan Science and Technology Innovation Leading Talent Program (214200510019).
Keywords
relation classification
BERT
entity information
shortest dependency path