
Knowledge Graph Completion Based on Contrastive Learning and Language Model-Enhanced Embedding
Abstract: A knowledge graph is a structured knowledge base comprising various types of knowledge or data units obtained through extraction and other processes. It is used to describe and represent information such as entities, concepts, facts, and relationships. The limitations of Natural Language Processing (NLP) technology and the presence of noise in the texts of various knowledge or information units affect the accuracy of information extraction. Existing Knowledge Graph Completion (KGC) methods typically account for only single structural information or textual semantic information, disregarding the structural and textual semantic information that coexist in the knowledge graph as a whole. To address this problem, a KGC model based on contrastive learning and language model-enhanced embedding is proposed. The input entities and relations are passed through a pretrained language model to obtain their textual semantic information, the distance scoring function of a translation model is used to capture the structural information in the knowledge graph, and two negative sampling methods for contrastive learning are combined to train the model, improving its ability to represent positive and negative samples. Experimental results show that, compared with the Bidirectional Encoder Representations from Transformers for Knowledge Graph completion (KG-BERT) model, the proposed model improves Hits@10 (the average proportion of triples ranked 10 or better in link prediction) by 31% and 23% on the WN18RR and FB15K-237 datasets, respectively, demonstrating its superiority over comparable models.
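The abstract names two ingredients of the model: a translation-model distance score for structural information and a contrastive objective over positive and negative samples. The paper's exact formulas are not given here, so the following is a minimal sketch assuming a TransE-style L2 distance (h + r ≈ t) and an InfoNCE-style contrastive loss; all function names and the temperature value are illustrative assumptions, not the authors' implementation.

```python
import math

def transe_score(h, r, t):
    """TransE-style distance score ||h + r - t||_2 over plain-list embeddings;
    a lower score means the triple (h, r, t) is more plausible."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))

def info_nce_loss(sim_pos, sim_negs, temperature=0.05):
    """InfoNCE contrastive loss given the similarity of the positive pair and
    a list of negative-pair similarities (e.g., from two negative sampling
    schemes pooled together). Lower loss = positive stands out from negatives."""
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    m = max(logits)  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)  # -log softmax probability of the positive
```

For example, a triple whose embeddings satisfy h + r = t exactly scores 0, and the loss shrinks as the positive similarity rises above the negatives' similarities, which is the behavior the contrastive training described in the abstract relies on.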
Authors: ZHANG Hongchen; LI Linyu; YANG Li; SAN Chenjun; YIN Chunlin; YAN Bing; YU Hong; ZHANG Xuan (Policy Research and Enterprise Management Department, Yunnan Power Grid Co., Ltd., Kunming 650032, Yunnan, China; School of Software, Yunnan University, Kunming 650091, Yunnan, China; Electric Power Research Institute, Yunnan Power Grid Co., Ltd., Kunming 650217, Yunnan, China; Key Laboratory of Software Engineering of Yunnan Province, Kunming 650091, Yunnan, China; Engineering Research Center of Cyberspace, Kunming 650091, Yunnan, China)
Source: Computer Engineering (《计算机工程》, indexed in CAS, CSCD, and the Peking University Core list), 2024, No. 4, pp. 168-176 (9 pages)
Funding: National Natural Science Foundation of China (61862063, 61502413, 61262025); Innovation Project of Yunnan Power Grid Co., Ltd. (YNKJXM20222254); Yunnan Province Reserve Talent Project for Young and Middle-aged Academic and Technical Leaders (202205AC160040); Yunnan Province Academician and Expert Workstation Project (202205AF150006); Major Science and Technology Special Program of Yunnan Province (202202AE090066); Scientific Research Fund of the Yunnan Provincial Department of Education (2023Y0256); "Knowledge-Driven Intelligent Software Engineering Research and Innovation Team" Project of the School of Software, Yunnan University.
Keywords: Knowledge Graph Completion (KGC); knowledge graph; contrastive learning; pretrained language model; link prediction