期刊文献+

自适应云端的大规模导出子图提取算法 被引量:7

Large Scale Induced Subgraphs Mining Algorithm on Self Adaptive Cloud
下载PDF
导出
摘要 针对现有云计算平台资源随机调配与传统导出子图挖掘效率较低等问题,进一步提升云计算平台中资源整合利用效率与大规模导出子图挖掘效率,提出了一种自适应云端的大规模导出子图提取算法,以解决资源优化利用与海量图挖掘等问题。首先介绍了云计算概念与导出子图挖掘相关概念以及问题描述;接着根据MapReduce并行处理模型设计了一种自适应任务动态分配算法SAC_TA(Self Adaptive Cloud Dynamic Allocation),它根据计算任务自适用分配系统资源以达到成本消耗的最优;并设计出自适应云端框架,然后基于自适应云端提出了大规模导出子图挖掘算法SFGFF(SAC_TA、Find_VE、G_F1、FindPartFG、FindAllFG),它共分为4个阶段的挖掘,将所有算法应用到自适应云端中可构成整个导出子图挖掘体系;最后在人工模拟数据与真实环境数据下进行了试验,结果表明,自适应云端运行良好,算法有效可行,具有较高的加速比与运行效率,能有效满足大规模频繁导出子图挖掘的需求。 Aiming at the current puzzles of random resource allocation of cloud computing platform and lower mining efficiency of traditional induced subgraph,promoting the efficiency of resource integration and using of cloud computing platform and large-scale induced subgraph mining,the paper put forward an algorithm of large-scale induced subgraph extraction for self-adaption cloud to solve the problems of resource optimal utilization and massive graph mining.The paper firstly introduced the relevant concepts and problem description of cloud computing and induced subgraph mining,then designed an algorithm SAC_TA of self-adaption task dynamic allocation according to MapReduce parallel processing model,which can comput task self-adaption allocation system resources to reach the optimum of cost wasting,meanwhile designed the self-adaption cloud framework.On the basis of the framework,the paper put forward the massive induced subgraph mining algorithm SFGFF,which includs four stages of mining.And while applying all the algorithms to self-adaption cloud,the whole induced subgraph mining system can be constructed.The experimental result of manual simulation data and real environment data shows that the self-adaption cloud runs well and the algorithms are efficient and feasible,and have higher speed-up ratio and operating efficiency to satisfy the demand of massive frequent induced subgraph mining.
出处 《计算机科学》 CSCD 北大核心 2014年第6期155-160,198,共7页 Computer Science
基金 湖南省工业支撑计划重点项目(2012GK2006) 湖南省教育厅科学研究项目(12C0291 11C1051) 生态旅游湖南省重点实验室开放基金项目(JDSTLY201206) 湖南省图书馆学会2013-2014年度重点课题(XHZD1007)资助
关键词 大数据 数据挖掘 云计算 导出子图 子图同构 Big data Data mining Cloud computing Induced subgraph Subgraph isomorphism
  • 相关文献

参考文献19

  • 1覃雄派,王会举,杜小勇,王珊.大数据分析——RDBMS与MapReduce的竞争与共生[J].软件学报,2012,23(1):32-45. 被引量:386
  • 2TDWI Checklist Report:Big Data Analytics[OL].http://tdwi.org/research/2010/08 Big-Data-Analytics.aspx.
  • 3邹兆年,李建中,高宏,张硕.从不确定图中挖掘频繁子图模式[J].软件学报,2009,20(11):2965-2976. 被引量:32
  • 4Zou Xiao-hong,Chen Xiao,Guo Jing-feng,et al.An improved algorithm for mining Close Graph[J].ICIC Express Letters Journal of Research and Surveys,2010,4(4):1135-1140.
  • 5薛冰,张俊峰,郑超.基于分割图集的频繁闭图挖掘算法[J].计算机应用研究,2011,28(1):61-64. 被引量:3
  • 6Guo Jing-feng,Chai Ran,Li Jia.Top-down algorithm for mining maximal frequent subgraph[J].Advanced Research on Industry,Information System and Materials Engineering,2011,204-210:1472-1476.
  • 7刘勇,李建中,高宏.从图数据库中挖掘频繁跳跃模式[J].软件学报,2010,21(10):2477-2493. 被引量:10
  • 8Gupta S,Raman V,Saurabh S.Maximum r-Regular Induced Subgraph Problem:Fast Exponential Algorithms and Combinatorial Bounds[J].SIAM Journal on Discrete Mathematics,2012,26(4):1758-1780.
  • 9Lenk A,Klems M,Nimis J,et al.What's inside the cloud? An Architectural Map of the Cloud Landscape[C]//Proceedings of the 2009 ICSE Workshop on Software Engineering Challenges of Cloud Computing.2009:23-31.
  • 10Son J,Choi H,Chung Y D.Skew-tolerant key distribution for load balancing in MapReduce[J].IEICE Transaction on Information and Systems,2012,95 (2):677-680.

二级参考文献118

  • 1卢云彬,曹汉强.基于Hash表的关联规则挖掘算法的改进[J].计算机技术与发展,2007,17(6):12-14. 被引量:10
  • 2Inokuchi A,Washio T,Motoda H.An apriori-based algorithm for mining frequent substructures from graph data.In:Cheng M,Yu PS,Liu B,eds.Proc.of the 4th European Conf.on Principles of Data Mining and Knowledge Discovery.Lyon:Springer-Verlag,2000.13-23.
  • 3Kuramochi M,Karypis G.Frequent subgraph discovery.In:Cercone N,Lin TY,Wu X,eds.Proc.of the 1st IEEE Int'l Conf.on Data Mining.San Jose:IEEE Computer Society,2001.313-320.
  • 4Yan X,Han J.gSpan:Graph-Based substructure pattern mining.In:Aggrawal R,Dittrich K,Ngu AH,eds.Proc.of the 2nd IEEE Int'l Conf.on Data Mining.Maebashi:IEEE Computer Society,2002.721-724.
  • 5Borgelt C,Berhold MR.Mining molecular fragments:Finding relevant substructures of molecules.In:Aggrawal R,Dittrich K,Ngu AH,eds.Proc.of the 2nd IEEE Int'l Conf.on Data Mining.Maebashi:IEEE Computer Society,2002.51-58.
  • 6Huan J,Wang W,Prins J.Efficient mining of frequent subgraphs in the presence of isomorphism.In:Wu X,Tuzhilin A,eds.Proc.of the 3rd IEEE Int'l Conf.Data Mining.Melbourne:IEEE Computer Society,2003.549-552.
  • 7Nijssen S,Kok JN.A quickstart in frequent structure mining can make a difference.In:Kim W,Kohavi R,Gehrke J,DuMouchel W,eds.Proc.of the 10th ACM SIGKDD Int'l Conf.on Knowledge Discovery and Data Mining.Seattle:ACM,2004.647-652.
  • 8Yan X,Han J.Closegraph:Mining closed frequent graph patterns.In:Getoor L,Senator TE,Domingos P,Faloutsos C,eds.Proc.of the 9th ACM SIGKDD Int'l Conf.on Knowledge Discovery and Data Mining.Washington:ACM,2003.286-295.
  • 9Huan J,Wang W,Prins J,Yang J.Spin:Mining maximal frequent subgraphs from graph databases.In:Kim W,Kohavi R,Gehrke J,DuMouchel W,eds.Proc.of the 10th ACM SIGKDD Int'l Conf.on Knowledge Discovery and Data Mining.Seattle:ACM,2004.581-586.
  • 10Thomas LT,Valluri SR,Karlapalem K.Margin:Maximal frequent subgraph mining.In:Proc.of the 6th IEEE Int'l Conf.Data Mining.Hong Kong:IEEE Computer Society,2006.1097-1101.http://www.computer.org/portal/web/csdl/doi/10.1109/ ICDM.2006.102.

共引文献425

同被引文献71

  • 1ANCHURI P, ZAKI M J, BARKOL O, et al.Approximate graph mining with label costs[C].Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining.ACM, 2013 : 518-526.
  • 2KANG U,AKOGLU L, CHAU D H P.Big graph mining: algorithms, anomaly detection, and applications[J].Proceedings of the ACM ASONAM, 2013,13 : 25-28.
  • 3ZHU F ,QU Q ,LO D, et al.Mining top-k large structural patterus in a massive network[J].Proceedings of the VLDBEndowment, 2011,4( 11 ) : 807-818.
  • 4AKOGLU L, CHAU D H, KANG U, et al.Opavion : mining and visualization in large graphs[C].Proeeedings of the 2012 ACM SIGMOD International Conference on Management of Data.ACM, 2012 : 717-720.
  • 5SARMA A D, AFRATI F N, SALIHOGLU S, et al. Upper and lower hounds on the east of a map-reduce e:mputa- tion[C].Proceedings of the VLDB Endowment. VLDB Endowment, 2013,6(4) : 277-288.
  • 6BORGELT C, MEINL T, BERTHOLD M. Moss : a program for molecular substructure mining[C].Proeeedings of the 1st international workshop on open source data mining:frequent pattern mining implementations. ACM, 2005 : 6-15.
  • 7BORGELT C.Canonical forms for frequent graph mining[M] Advances in Data Analysis, Springer Berlin Heidelberg, 2007 : 337-349.
  • 8LESKOVEC J.Stanford large network dataset collection[J]. URL http ://snap.stanford.edu/data/index.html, 2011.
  • 9Anchuri P, Zaki M J, Barkol O, et al. Approximate graph mining with label costs[C]. Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2013: 518-526.
  • 10Kang U, Akoglu L, Chau D H P. Big Graph Mining: Algorithms, Anomaly Detection, and Applications [J].Proceedings of the ACM ASONAM, 2013, 13: 25-28.

引证文献7

二级引证文献85

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部