Abstract
Knowledge graph partitioning methods that ignore semantic knowledge increase query communication volume and reduce query execution efficiency after partitioning. The semantic knowledge in frequently used query statements can be used to aggregate highly correlated subgraph structures, and NI-LPA (Node Importance-Label Propagation Algorithm) supports multiple labels, has low time complexity, and yields high partition quality. On this basis, a knowledge graph partitioning method based on query semantics and NI-LPA is proposed. The method performs semantic analysis on a set of frequently used SPARQL queries, uses the results to compute the semantic correlation between nodes in the knowledge graph, and combines this correlation with the node importance that represents structural features in NI-LPA to obtain the propagation strength between nodes, so that an important node and the nodes with high semantic relevance to it are more likely to share the same label. Experimental results show that, compared with COPRA and NI-LPA, the method reduces both the edge-cut rate and the communication volume, and effectively improves the query same-part rate while keeping redundancy low.
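The idea of combining a query-derived semantic weight with a structural node importance can be illustrated with a small sketch. The abstract does not give the actual formulas, so everything below is an assumption made for illustration: the function names `predicate_weights` and `propagation_strength` are hypothetical, the semantic weight is taken to be the fraction of workload queries that mention a predicate, node importance is approximated by normalized degree, and the two are combined by a simple product. None of this should be read as the authors' definition.

```python
from collections import Counter


def predicate_weights(queries):
    """Assumed semantic weight: fraction of queries that mention each predicate.

    `queries` is a list of predicate lists, one list per SPARQL query.
    """
    counts = Counter()
    for predicates in queries:
        counts.update(set(predicates))  # count each predicate once per query
    total = len(queries)
    return {p: c / total for p, c in counts.items()}


def propagation_strength(triples, queries):
    """Illustrative propagation strength for each (subject, predicate, object) edge.

    Assumptions (not from the paper): importance is degree normalized by the
    maximum degree, and strength = importance(subject) * (1 + semantic weight).
    """
    weights = predicate_weights(queries)
    degree = Counter()
    for s, _, o in triples:
        degree[s] += 1
        degree[o] += 1
    max_degree = max(degree.values())

    strength = {}
    for s, p, o in triples:
        importance = degree[s] / max_degree
        strength[(s, p, o)] = importance * (1.0 + weights.get(p, 0.0))
    return strength


if __name__ == "__main__":
    # Toy graph and toy query workload, both invented for this example.
    triples = [
        ("a", "knows", "b"),
        ("a", "likes", "c"),
        ("a", "knows", "d"),
        ("b", "likes", "c"),
    ]
    queries = [["knows"], ["knows", "likes"]]
    for edge, value in propagation_strength(triples, queries).items():
        print(edge, round(value, 3))
```

In this toy setup, edges whose predicate appears in more queries and whose source node has a higher degree receive a larger strength, which is the qualitative behavior the abstract describes: labels spread more readily from important nodes to semantically related neighbors.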
Authors
XU Hang (徐航)
LIU Yu (刘宇)
School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065; Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan University of Science and Technology, Wuhan 430065
Source
Computer & Digital Engineering (《计算机与数字工程》)
2024, No. 6, pp. 1727-1732, 1738 (7 pages)
Funding
Supported by the National Natural Science Foundation of China (Grant No. U1836118).
Keywords
knowledge graph partition
multi-label propagation algorithm
semantics
communication volume
query same-part rate