Abstract
Knowledge graph representation learning aims to map entities and relations into a low-dimensional dense vector space. Most existing models focus on learning the structural features of triples while ignoring both the semantic information of entity-relation interactions within triples and the entity description information outside them, so their knowledge representation ability is limited. To address this problem, a knowledge representation learning model named BAGAT (knowledge representation learning based on BERT model And Graph Attention network), which fuses multi-source information, was proposed. First, the target entity nodes and neighbor nodes of triples were constructed by combining knowledge graph features, and a Graph Attention Network (GAT) was used to aggregate the semantic representation of the triple structure. Then, the Bidirectional Encoder Representations from Transformers (BERT) word vector model was used to embed entity description information. Finally, the two representations were mapped into the same vector space for joint knowledge representation learning. Experimental results show that BAGAT achieves a large improvement over other models: on the Hits@1 and Hits@10 metrics of the link prediction task on the public dataset FB15K-237, BAGAT outperforms the translation model TransE (Translating Embeddings) by 25.9 and 22.0 percentage points respectively, and outperforms the graph neural network model KBGAT (Learning attention-based embeddings for relation prediction in knowledge graphs) by 1.8 and 3.5 percentage points respectively. This indicates that a multi-source representation method fusing entity description information with the semantic information of the triple structure can achieve stronger representation learning capability.
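The pipeline the abstract describes, attention-weighted aggregation over a triple's neighbor nodes for structural semantics, a BERT-derived vector for the entity description, and a projection of both into one shared space, can be sketched roughly as below. This is an illustrative NumPy sketch under stated assumptions, not the authors' implementation: the names `gat_aggregate` and `joint_embed`, the projection matrices `W_s` and `W_d`, and the random stand-in for a BERT description vector are all hypothetical.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def gat_aggregate(node_vec, neighbor_vecs, a):
    """GAT-style aggregation: score each neighbor against the target node
    with a shared attention vector `a`, then take the attention-weighted sum."""
    scores = np.array([a @ np.concatenate([node_vec, n]) for n in neighbor_vecs])
    alpha = softmax(scores)                       # attention weights, sum to 1
    return (alpha[:, None] * neighbor_vecs).sum(axis=0)

def joint_embed(struct_vec, desc_vec, W_s, W_d):
    """Project the structural and description embeddings into the same
    vector space and combine them (here: by summation, an assumption)."""
    return W_s @ struct_vec + W_d @ desc_vec

rng = np.random.default_rng(0)
d_struct, d_desc, d_joint = 8, 16, 4
node = rng.normal(size=d_struct)                  # target entity node
neighbors = rng.normal(size=(3, d_struct))        # its triple neighbors
a = rng.normal(size=2 * d_struct)                 # attention parameters

struct = gat_aggregate(node, neighbors, a)        # structural semantics
desc = rng.normal(size=d_desc)                    # stand-in for a BERT [CLS] vector
W_s = rng.normal(size=(d_joint, d_struct))
W_d = rng.normal(size=(d_joint, d_desc))
joint = joint_embed(struct, desc, W_s, W_d)       # fused representation
```

In a real model the two projections would be trained jointly with a link-prediction objective; the sketch only shows how the two information sources meet in one space.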
Authors
JIAO Shoulong; DUAN Youxiang; SUN Qifeng; ZHUANG Zihao; SUN Chenhao (College of Computer Science and Technology, China University of Petroleum, Qingdao, Shandong 266555, China)
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2022, No. 4, pp. 1050-1056 (7 pages)
Funding
Fundamental Research Funds for the Central Universities (20CX05017A)
Major Science and Technology Project of China National Petroleum Corporation (ZD2019-183-006)
Keywords
knowledge graph
knowledge representation learning
Graph Attention Network(GAT)
Bidirectional Encoder Representations from Transformers(BERT)
multi-source information fusion