
MapReduce Model Based on Tree Structure (cited by: 9)

Abstract: MapReduce is a parallel distributed computing model developed by Google that is widely used for searching and processing massive data sets. The model suits only programs whose data are weakly correlated and highly parallelizable; it does not handle strongly correlated data such as tree structures. This paper discusses the implementation mechanism of MapReduce in detail and proposes a MapReduce model based on tree structure. The model relies on a repeated polling process of clustering and aggregation, and during aggregation it uses <k1, k2, …, kn, value> in place of the traditional <k, value>, which makes the model more general. Finally, a Hadoop platform is built to process massive XML-structured data and to compare the efficiency of the old and new models. The experimental results show that the new model executes significantly faster than the traditional model.
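
To make the <k1, k2, …, kn, value> idea concrete, the following is a minimal single-process sketch in Python, not the authors' Hadoop implementation. It assumes each leaf value is keyed by its full path from the root, and that each round groups the deepest records by their parent path and sums them, repeating until only the root remains. The names `aggregate_round` and `roll_up`, the use of summation as the aggregate, and the sample paths are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Dict, Tuple

Path = Tuple[str, ...]  # composite key (k1, k2, ..., kn): a root-to-node path


def aggregate_round(records: Dict[Path, int]) -> Dict[Path, int]:
    """One map/reduce-style round: fold the deepest records into their parents."""
    depth = max(len(path) for path in records)
    grouped: Dict[Path, int] = defaultdict(int)
    for path, value in records.items():
        # "Map": deepest records are re-keyed by their parent path (drop kn);
        # shallower records keep their key and wait for a later round.
        key = path[:-1] if len(path) == depth else path
        # "Reduce": sum all values that share the same key.
        grouped[key] += value
    return dict(grouped)


def roll_up(leaves: Dict[Path, int]) -> Dict[Path, int]:
    """Repeat rounds until only root-level keys remain; keep every subtotal."""
    totals = dict(leaves)
    level = dict(leaves)
    while level and max(len(path) for path in level) > 1:
        level = aggregate_round(level)
        totals.update(level)
    return totals


# Hypothetical element counts keyed by XML-like paths.
leaves = {
    ("library", "book", "title"): 2,
    ("library", "book", "author"): 3,
    ("library", "journal", "title"): 4,
}
totals = roll_up(leaves)
print(totals[("library", "book")], totals[("library",)])
```

Each pass of the loop plays the role of one polling round: keys shrink by one component per round, so a tree of depth n needs at most n - 1 rounds under these assumptions.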
Source: Computer Technology and Development (《计算机技术与发展》), 2011, No. 8, pp. 149-152 (4 pages)
Funding: Natural Science Foundation of Yunnan Province (2007F174M); Yunnan University Graduate Research Project (ynny200928)
Keywords: tree structure; MapReduce; XML; Hadoop
