Research on and Improvement of MapReduce in Scientific Computing


Abstract: The HaLoop model does not support communication between compute nodes, and the Twister model suffers from a large amount of overlapping data. To address both problems, this paper adds an inter-node communication mechanism and a buffering mechanism to the Hadoop model. Specifically: first, a parameter M is introduced into the Map function to distinguish the four classes of algorithms found in scientific computing; second, frequently used functions are wrapped as adapters; third, static data is declared as a protected type, to keep it safe, and is stored in a buffer pool. Finally, experiments carried out on Hadoop show that the speedup increases as the number of compute nodes grows.
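The abstract reports its result as a speedup that grows with the number of compute nodes. As a reminder of what that metric measures, the following is a minimal sketch using the standard definition S(n) = T(1) / T(n), where T(n) is the run time on n nodes. The function names and the timing values are hypothetical and chosen only for illustration; they are not taken from the paper's experiments.

```python
def speedup(t_single: float, t_parallel: float) -> float:
    """Standard speedup: run time on one node divided by run time on n nodes."""
    if t_single <= 0 or t_parallel <= 0:
        raise ValueError("run times must be positive")
    return t_single / t_parallel


def speedup_table(timings: dict[int, float]) -> dict[int, float]:
    """Map each node count to its speedup relative to the 1-node run."""
    base = timings[1]
    return {n: speedup(base, t) for n, t in sorted(timings.items())}


# Hypothetical wall-clock times in seconds (NOT measurements from the paper).
timings = {1: 120.0, 2: 66.0, 4: 37.5, 8: 24.0}
table = speedup_table(timings)

for n, s in table.items():
    # Efficiency (speedup per node) shows how far the run is from linear scaling.
    print(f"{n} nodes: speedup {s:.2f}, efficiency {s / n:.2f}")
```

A speedup that keeps rising with n, as the abstract claims, can still come with falling efficiency, so reporting both figures gives a fuller picture of scaling.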

Authors: Liu Feng (刘锋), Zhou Feifeng (周飞凤)

Affiliation: School of Computer Science and Technology, Anhui University

Source: Wireless Internet Technology (《无线互联科技》), 2013, No. 3, pp. 113-114 (2 pages)

Keywords: MapReduce technology; scientific computing; Map function; Reduce function

Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture]



