Abstract
With the continued emergence of knowledge graphs such as the Google Knowledge Graph, DBpedia, Microsoft Concept Graph, and YAGO, knowledge representation systems built on RDF have become increasingly well known. The RDF triple has become the basic form for describing real-world knowledge: its simple structure and clear logic make it easy to understand and implement. For the same reason, however, it cannot capture every aspect of the highly complex knowledge and common sense found in the real world, and the way knowledge graphs are constructed means that they are inevitably incomplete; that is, a knowledge base cannot contain all known knowledge. Knowledge base completion is therefore particularly important: any existing knowledge graph must be improved continuously through completion, which can even infer new knowledge. Starting from the construction process of knowledge graphs, this paper divides the knowledge graph completion problem into two levels: concept completion and instance completion. (1) Concept completion mainly addresses entity type completion and is described in three development stages: logical reasoning based on description logic, type inference based on traditional machine learning, and type inference based on representation learning. (2) Instance completion can be further divided into RDF triple completion and new instance discovery. This paper focuses on RDF triple completion and describes methods for entity completion or relation completion through three development stages: statistical relational learning, probabilistic learning based on random walks, and knowledge representation learning. Having reviewed and discussed the research history, current status, and latest progress of large-scale knowledge graph completion, we conclude with the challenges the technology must address and the prospects for related research directions.
Authors
王硕
杜志娟
孟小峰
Shuo WANG; Zhijuan DU; Xiaofeng MENG (Information School, Renmin University of China, Beijing 100872, China; Key Laboratory of Machine Learning and Computational Intelligence, Hebei University, Baoding 071002, China)
Source
《中国科学:信息科学》
CSCD
PKU Core Journals (北大核心)
2020, Issue 4, pp. 551-575 (25 pages)
Scientia Sinica(Informationis)
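Funding
National Natural Science Foundation of China (Grant Nos. 61532010, 61532016, 91846204, 91646203, 61762082)
National Key Research and Development Program of China (Grant Nos. 2016YFB1000602, 2016YFB1000603)
Research Funds of Renmin University of China (Grant No. 11XNL010)
Science and Technology Open Cooperation Project of Henan Province (Grant No. 172106000077)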
Keywords
knowledge graph
knowledge base completion
concept completion
instance completion