Abstract
Graph aggregation represents a large-scale graph as a concise, smaller graph while preserving the structural and attribute information of the original graph. Existing algorithms do not consider node attribute information and edge weight information together, so the aggregated graph can differ substantially from the original. We therefore propose a graph aggregation algorithm that considers both, so that the aggregated graph preserves node attribute similarity as well as edge weight information. The algorithm first defines a closed-neighborhood structural similarity and uses a pruning strategy to compute the structural similarity between nodes. It then uses the minimum hash (MinHash) technique to compute the attribute similarity between nodes and adjusts the relative proportions of structural similarity and attribute similarity. Finally, the weighted graph is aggregated according to these two similarities. Experiments show that the algorithm is feasible and effective.
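The two similarity measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact definitions, so the sketch assumes that the closed-neighborhood structural similarity is a Jaccard overlap of closed neighborhoods (a node plus its neighbors), that attribute similarity is the standard MinHash estimate of Jaccard similarity over attribute sets, and that the two are mixed by a hypothetical weight `alpha`. The pruning strategy and the edge weights are omitted, and all function names are illustrative.

```python
import hashlib


def closed_neighborhood(adj, v):
    """Return the closed neighborhood of v: v itself plus its neighbors."""
    return set(adj[v]) | {v}


def structural_similarity(adj, u, v):
    """Jaccard overlap of closed neighborhoods (an assumed definition)."""
    nu, nv = closed_neighborhood(adj, u), closed_neighborhood(adj, v)
    return len(nu & nv) / len(nu | nv)


def _seeded_hash(seed, item):
    """Deterministic 64-bit hash of an item under a given seed."""
    digest = hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=128):
    """One minimum per seeded hash function; items must be non-empty."""
    return [min(_seeded_hash(seed, x) for x in items)
            for seed in range(num_hashes)]


def attribute_similarity(sig_a, sig_b):
    """Fraction of matching signature positions estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def combined_similarity(adj, attrs, u, v, alpha=0.5, num_hashes=128):
    """Mix structural and attribute similarity; alpha is a hypothetical weight."""
    s = structural_similarity(adj, u, v)
    a = attribute_similarity(minhash_signature(attrs[u], num_hashes),
                             minhash_signature(attrs[v], num_hashes))
    return alpha * s + (1 - alpha) * a


# Toy undirected graph and attribute sets (made-up data for illustration).
adj = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
attrs = {
    "a": {"db", "graph"},
    "b": {"db", "graph"},
    "c": {"graph", "ml"},
    "d": {"vision", "nlp"},
}
```

Under these assumptions, node pairs with a high combined score would be candidates for merging into one super-node; how the paper ranks pairs and carries edge weights into the aggregated graph is not specified in the abstract.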
Authors
BING Rui (邴睿)
MA Hui-fang (马慧芳)
LIU Yu-hang (刘宇航)
YU Li (余丽)
Affiliations: College of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070; Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004; Guangxi Key Laboratory of Multi-Source Information Mining & Security, Guangxi Normal University, Guilin 541004, China
Source
Computer Engineering & Science (《计算机工程与科学》)
CSCD
Peking University Core Journal (北大核心)
2019, Issue 10, pp. 1777-1784 (8 pages)
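Funding
National Natural Science Foundation of China (61762078, 61363058)
Research Project of the Guangxi Key Laboratory of Trusted Software (kx201910)
Open Fund of the Guangxi Key Laboratory of Multi-Source Information Mining & Security (MIMS18-08)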
Keywords
graph aggregation
structural similarity
attribute similarity
weighted graph
minimum hash (MinHash)