
Fast Closed Tree Clustering Parallel Algorithm for Dynamic Cloud Platform (cited by: 2)
Abstract: To improve the efficiency of clustering, this paper proposes a fast closed tree clustering parallel algorithm based on a dynamic cloud platform. To address the random task allocation strategy of the Hadoop cloud computing platform, it presents CDA-GA, a task allocation algorithm that minimizes consumption cost, and builds a dynamic cloud platform model on top of that algorithm. The traditional frequent closed tree mining algorithm and the clustering algorithm are parallelized and applied on the dynamic cloud platform, and a closed tree clustering framework based on that platform is designed. Experimental results show that the algorithm is effective and feasible, and that it is suitable for cluster analysis on large-scale data.
Source: Computer Engineering (《计算机工程》), indexed in CAS and CSCD, 2013, Issue 9, pp. 80-83 (4 pages)
Funding: General Project funded by the Education Department of Hunan Province (10C1100); Jishou University Research Program Fund (11JD051)
Keywords: data mining; cloud computing; parallel computing; closed tree; tree clustering; mass data
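The abstract describes a task allocation algorithm that replaces random assignment with one that minimizes consumption cost. The record does not give the details of CDA-GA, so the following is only a generic genetic-algorithm sketch of that idea, not the authors' method. The cost matrix, the function names `total_cost` and `ga_assign`, and all parameters are hypothetical and chosen for illustration.

```python
import random


def total_cost(assignment, cost):
    """Sum cost[task][node] over the node chosen for each task."""
    return sum(cost[task][node] for task, node in enumerate(assignment))


def ga_assign(cost, pop_size=30, generations=60, mutation_rate=0.1, seed=0):
    """Toy genetic algorithm (assumed setup, not CDA-GA itself).

    An individual is a list with one gene per task; the gene value is
    the index of the node that the task is assigned to.
    """
    rng = random.Random(seed)
    n_tasks, n_nodes = len(cost), len(cost[0])
    population = [
        [rng.randrange(n_nodes) for _ in range(n_tasks)]
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Elitist selection: keep the cheaper half unchanged.
        population.sort(key=lambda a: total_cost(a, cost))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            # One-point crossover.
            cut = rng.randrange(1, n_tasks) if n_tasks > 1 else 0
            child = parent_a[:cut] + parent_b[cut:]
            # Mutation: occasionally move a task to a random node.
            for i in range(n_tasks):
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_nodes)
            children.append(child)
        population = survivors + children
    return min(population, key=lambda a: total_cost(a, cost))


# Hypothetical cost of running each of 4 tasks on each of 3 nodes.
example_cost = [
    [4, 2, 9],
    [7, 3, 1],
    [5, 8, 2],
    [6, 1, 4],
]
best = ga_assign(example_cost)
print(best, total_cost(best, example_cost))
```

With independent per-task costs like these, picking the cheapest node for each task is already optimal. A search of this kind only pays off when the cost model couples tasks, for example through node load or data transfer, which this sketch leaves out.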


