Abstract
Distributed parallel computing is a common way to improve computing performance, but there is no unified model or method for designing parallel programs across differing requirements, so writing them depends entirely on the developer's experience. MapReduce, the distributed parallel programming model proposed by Google, supports the development and execution of a specific class of parallel programs. The MapReduce model is optimized here with a hash table, which reduces fragmentation in the intermediate results and removes the need to call the intermediate Combiner function, thereby lowering the transmission load and improving running efficiency. The interface properties of the Map and Reduce functions are preserved, so the model retains its parallelism.
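The idea of aggregating Map output in a hash table can be illustrated with a minimal sketch. This record does not include the paper's implementation, so everything below is an assumption made for illustration: the word-count workload, the single-process simulation of the shuffle, and the hypothetical names `map_plain`, `map_hashed`, `reduce_counts`, and `run_job`.

```python
from collections import defaultdict


def map_plain(split):
    """Baseline Map: emit one (word, 1) pair per occurrence."""
    return [(word, 1) for word in split.split()]


def map_hashed(split):
    """Map with a local hash table: emit one (word, count) pair per distinct word.

    Aggregation happens inside the Map call, so no separate Combiner pass
    is needed (illustrative assumption, not the paper's code).
    """
    table = defaultdict(int)
    for word in split.split():
        table[word] += 1
    return list(table.items())


def reduce_counts(key, values):
    """Reduce: sum all partial counts for one key."""
    return key, sum(values)


def run_job(splits, map_fn):
    """Simulate map, shuffle, and reduce in one process.

    Returns the final counts and the number of intermediate records
    that would have to be transferred between the two phases.
    """
    intermediate = [pair for split in splits for pair in map_fn(split)]
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)
    result = dict(reduce_counts(k, v) for k, v in groups.items())
    return result, len(intermediate)


splits = ["a b a c a", "b b c a"]
plain_result, plain_records = run_job(splits, map_plain)
hashed_result, hashed_records = run_job(splits, map_hashed)
```

In this toy run both variants produce the same final counts, while the hash-table variant passes fewer intermediate records to the reduce phase. The signatures of the Map and Reduce functions are the same in both cases, which mirrors the abstract's point that the interfaces are kept.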
Source
Journal of Shandong University (Natural Science)
Indexed in: CAS, CSCD, PKU Core
2015, No. 7, pp. 66-70 (5 pages)
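Funding
National Natural Science Foundation of China, Young Scientists Fund (61303209)
Lu'an City Commissioned Municipal Research Project for West Anhui University (2013LWA004)
Key Project of the Anhui Provincial Department of Education (KJ2013A255)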