期刊文献+

图稀疏化:加速图聚类的有效方法 被引量:3

Graph sparsification:Effective way to accelerate graph clustering
下载PDF
导出
摘要 为保证在不牺牲精度的前提下加快大规模图聚类速度,将稀疏化思想引入图聚类中,在大图聚类之前增加一个稀疏化图的环节,稀疏化之后的图能够很好地保持原始图中各类结构,可实现在更小规模数据集上进行图聚类以提高运行速度。针对DBLP数据集构成的图,分别在原始图和稀疏化图上使用k-medoids图聚类算法,比较其运行时间和聚类精度,实验结果表明,在稀疏化图上进行聚类,可大大缩短运行时间,聚类精度并没有降低,实验分类情况和实际情况相吻合,取得了很好的聚类效果。 To accelerate the speed of large-scale graph clustering without sacrificing accuracy, the idea of sparsification is introduced to graph clustering, a graph sparsification link is added before large scale graph clustering. The cluster structure of the original graph is preserved very well in the sparsified graph, so that it can realize graph clustering on a smaller scale dataset to improve speed. With example, K-medoids graph clustering algorithm is applied to partition the original graph extracted from the DBLP data and the sparsified graph respectively, the running time is compared with clustering accuracy. Experimental results demonstrate that in the sparsified graph clustering, the running time is shortened without sacrificing quality, as well as experi- mental clustering is consistent with the actual situation, good effect of clustering is achieved.
出处 《计算机工程与设计》 CSCD 北大核心 2013年第11期3934-3938,共5页 Computer Engineering and Design
基金 国家自然科学基金项目(51105077) 广东省自然科学基金项目(10152800001000029)
关键词 图聚类 图稀疏化 采样 相似度 k-medoids算法 graph clustering graph sparsification sampling similarity k-medoids algorithm
  • 相关文献

参考文献10

  • 1Nicolds Garcia Pedrajas, Aida de Haro-Garcia. Scaling up data mining algorithms: Review and taxonomy [J]. Progress in Artificial Intelligence, 2012, 1 (1): 71-87.
  • 2Andrew J Maida, John Harlirr- Filtering complex turbulent systems [M]. Cambridge: Cambridge University Press, 2012.
  • 3Glattfelder J B, Battiston S. Backbone of complex networks of corporations: The flow of control [J]. Phys Roy E, 2009, 80 (3) : 36104.
  • 4Satu Elisa Schaeffer. Scalable uniform graph sampling by local computation[J]. SIAM Journal on Scientific Computing, 2010, 32 (5):2937-2963.
  • 5Sanjeev Arora, Elad Hazan, Satyen Kale. O(√logn) approximation to SPARSEST CUT in O^-(n^2 ) time [J]. SIAM Journal on Scientfic Computing, 2010, 39 (5): 1748-1771.
  • 6Daniel A Spielman, Nikhil Srivastava. Graph sparisfication by effective resistances [C] //Victoria, British Columbia, Canada: STOC, 2008: 563-568.
  • 7Burno Ribeiro, Don Towsley. Estimating and sampling graphs with multidimensional random walks [C]//Melbourne, Australia: Internet Measurement Conference, 2010: 390-403.
  • 8Arun S Maiya, Tanya Y Berger-Wolf. Sampling community structure [C]//Raleigh, North Carolina, USA: WWW, 2010: 701-710.
  • 9Choi Seung-Seok, Cha Sung-Hyuk, Charles C Tappert. A survey of binary similarity and distance measures [J]. Systemics, Cybernetics and Informaties, 2010, 8 (1): 43-48.
  • 10温菊屏,钟勇.图聚类的算法及其在社会关系网络中的应用[J].计算机应用与软件,2012,29(2):161-163. 被引量:16

二级参考文献11

  • 1Venu Satuluri, Srinivasan Parthasarathy. Scalable graph clustering using stochastic flows : applications to community discovery [ C ]//KDD, Par- is, France ,2009:737 - 746.
  • 2Dongen S V. Graph Clustering by Flow Simulatiion [ D ]. Utrecht: Uni- versity of Utrecht,2000.
  • 3Shi J, Malik J. Normalized Cuts and Image Segmentation [ J ]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000,22 (8) :888 -905.
  • 4Xiaowei Xu, Nurcan Yuruk, Zhidan Feng, et al. SCAN: a structural clustering algorithm for networks [ C]//KDD. San Jose, CA, USA, 2007 : 824 - 833.
  • 5Scott J. Social Network Analysis:A Handbook [M ]. 2nd ed. London: Sage Publications Ltd,2000.
  • 6Michelle Girvan,Newman M E J. Community structure in social and bi- ological networks [ J ]. PNAS. 2002,99 ( 12 ) :7821 - 7826.
  • 7Newman M E J. Fast algorithm for detecting community structure in net- works[J], phys REVE,2004,69(6) :066133.
  • 8Filippo Radicchi, Claudio Castellano, Federico Cecconi. Defining and i- dentifying communities in networks [ C ]// Proceeding of Nail Acad. Sci ,2004,101 (9) :2658 - 2663.
  • 9palla G, Derenyi I, Farkas I. Uncovering the overlapping community structure of complex networks in nature and society[J]. Nature,2005, 435:814 -818.
  • 10郭春燕.基于连接度的图聚类方法研究[D].太原:山西大学,2008.

共引文献15

同被引文献27

  • 1江小平,李成华,向文,张新访,颜海涛.k-means聚类算法的MapReduce并行化实现[J].华中科技大学学报(自然科学版),2011,39(S1):120-124. 被引量:79
  • 2韩彦芳,施鹏飞.基于数字图像处理的表面缺损检测技术[J].测控技术,2005,24(9):15-18. 被引量:18
  • 3http://www.informatik.uni-trier.de/~ley/db/
  • 4Lin J, Schataz M. Design patterns for efficient graph algorithms in ma- preduce [ C ]//MLG,2010,22(3 ) :78 - 85.
  • 5Lv Qin, Josephson W,Wang Zhe, et al. Multi-probe LSH : efficient inde- xing for high-dimensional similarity search [ C ]//Pm of the 33 Int Conf on Very Large Data Bases ( VLDB' 07 ). Vienna Austria: VLDB Endowment ,2007,10 (2) :950 - 961.
  • 6Yang H C, Dasdan A, Hsiao R L, et al. Map-Reduce-Merge: Simplified relational data processing [ C ]//Proc of ACM SIGMOD International Conference on Management of Data, New York: ACM, 2007:1029 -1040.
  • 7Vrba Z, Halvorsen P, Griwodz C, et al. Kahn process networks are a flexible alternative to mapreduce [ C]//Pine of IEEE International Con- ference on High Performance Computing and Communications, Piscat- away : IEEE ,2009 : 154 - 162.
  • 8Sandholm T, Lai K. MapReduce optimization using regulated dynamic priofitization[ J ]. Performance Evaluation Review, 2009,37 ( 1 ) : 299 -310.
  • 9Liu Q,Todman T, Luk W, et al. Combining optimizations in automated low power design[ C]//Proc of Design, Automation&Test in Europe Conference&Exhibition, Piscataway: IEEE ,2010 : 1791 - 1796.
  • 10Chen Quan, Zhang Daqiang, Gao Mingi, et al. SAMR: A self-adaptive mapreduce scheduling algorithm in heterogeneous environment [ C ]// Proc of IEEE International Conference on Computer and Information Technology, Los Alamitos : IEEE computer society ,2010:2736 - 2743.

引证文献3

二级引证文献2

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部