期刊文献+

大图中全部极大团的并行挖掘算法研究 被引量:2

An Efficient Parallel Algorithm for Maximal Clique Enumeration in a Large Graph
下载PDF
导出
摘要 该文的目的在于优化现有的大图数据中全部极大团挖掘算法.在生物网络、社会网络及web分析中,找出图中的全部极大团是一个重要的应用.随着图数据规模的增大,传统的极大团挖掘算法因无法满足性能要求而被并行处理方式取代.但是,在现有的并行处理方法中,需要过滤大量的重复极大团和检测非极大团,降低了算法的性能.论文在分析了现有的极大团并行算法后,提出了新的大图中全部极大团挖掘算法.首先,使用顶点的偏序关系消除了冗余极大团以及非极大团的产生;第二,根据两个极大团之间至少存在一对无边的顶点的特征,提出了多颜色顶点涂色分片算法,将大图的顶点分为全色和半色两个集合;第三,证明了涂色分片算法是NP完全问题以及有一个多项式时间的2近似算法,并给出了近似算法;第四,基于多色顶点分片实现了一个并行的全部极大团挖掘算法,该算法只对全色顶点与它的邻接顶点组成重叠子图进行极大团挖掘;最后,对算法的性能以及加速比特性进行了评价,得出该算法能够处理百万个节点的大图并且性能比现有的算法有较大提高的实验结果. A maximal clique is perhaps the most fundamental dense substructure in a graph.Maximal clique enumeration is an important tool to discover densely connected sub graphs,with numerous applications on web graphs,social networks,and biological networks.Existing methods for finding all maximal cliques in a large graph based on distributed computing environment are,in our opinion,not quite efficient.We propose a better method using a multi-color partition algorithm to decompose large graph into lots of overlap sub graphs for finding all cliques in parallel.In the full paper,we explain our enumerating all maximal cliques method in detail.In this abstract,we just add some pertinent remarks to list the four topics of explanation.The first topic is:we use the heuristics in order to decrease the vertex search 1{space.In this topic,we discuss the two proposes of the algorithm:using vertex order to remove non-maximal cliques and duplicates,parallel policy to fit in distributed computing environment.The second topic is:we propose multiple color method to divide the vertexes of graph into two sets,which are related vertexes set and irrelevant vertex set.The two sets stained by full colors and semi colors respectively.Full colored vertex and its adjacent vertexes compose an overlap sub graph.On the other hand,semi colored vertexes is used to assure no lost of cliques.The third topic is a problem of the minimum full colors set.We first designed an approximation algorithm.Next we proved that this problem is NP complete.Then we proved this algorithm have an 2approximation algorithm for polynomial time.The fourth topic is:aparallel algorithm for enumerating all maximal cliques based on full colors set is proposed.In this topic,we use full colored vertexes and its related semi colored vertexes to split input graph to overlapped sub graphs.The full colored vertex acts as a pivot to enumerate maximal cliques over an overlap sub graph in parallel.Finally,we do experiments to appraise our method.Experimental evaluation on real datasets from various domains show that the response time of our method based on multi-color algorithm is more efficient comparing with the existing methods.The experiments on cluster of PCs also show that the algorithm can effectively process a variety of large real-world graphs with millions of vertices and the scalability of the multi-color partition algorithm is very well.
作者 汤小春 周佳文 田凯飞 李战怀 TANG Xiao-Chun;ZHOU Jia-Wen;TIAN Kai-Fei;LI Zhan-Huai(School of Computer Science,Northwestern Polytechnical University,Xi′an 710129)
出处 《计算机学报》 EI CSCD 北大核心 2019年第3期513-531,共19页 Chinese Journal of Computers
基金 中国科技部国家重点研发计划资助项目(2018YFB1003403)资助~~
关键词 图挖掘 极大团 涂色分片 并行算法 重叠子图 graph mining maximal clique enumeration multi-color partition parallel algorithm overlap sub graph
  • 相关文献

同被引文献4

引证文献2

二级引证文献3

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部