Abstract
HPMR is a software support platform for high-performance computing built on the MapReduce model, which it adapts to meet the demands of high-performance computing. An efficient memory management module is essential to the system's performance. HPMR has two roles, Master and Worker. The Master reads data blocks from the input data file and assigns them to Workers. Each Worker receives the data blocks assigned by the Master and manages the input and output data of the map and reduce functions. The current memory management module suffers from redundant communication, inefficient management, and insufficient parallelism in data processing. Drawing on established memory optimization theory, this paper redesigns the underlying data management mechanism of HPMR and proposes a memory management scheme based on a memory pool. Experiments show that the new memory management module is a necessary condition for an efficient HPMR system.
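The abstract does not describe how the memory pool is implemented, so the following is only a minimal sketch of the general technique, not the HPMR design. It assumes fixed-size blocks that are allocated once and then recycled. The names `BlockPool`, `acquire`, `release`, and `available`, along with the block size and count, are hypothetical and chosen purely for illustration.

```python
from collections import deque


class BlockPool:
    """Hypothetical fixed-size block pool: buffers are allocated once, then recycled."""

    def __init__(self, block_size: int, block_count: int) -> None:
        self.block_size = block_size
        # Preallocate every buffer up front so later requests never allocate.
        self._free = deque(bytearray(block_size) for _ in range(block_count))

    def acquire(self) -> bytearray:
        """Hand out a free block, or raise if the pool is exhausted."""
        if not self._free:
            raise MemoryError("pool exhausted")
        return self._free.popleft()

    def release(self, block: bytearray) -> None:
        """Return a block to the pool so it can be reused."""
        if len(block) != self.block_size:
            raise ValueError("block does not belong to this pool")
        self._free.append(block)

    @property
    def available(self) -> int:
        """Number of blocks currently free."""
        return len(self._free)


# Illustrative sizes only; the source gives no block size or count.
pool = BlockPool(block_size=4096, block_count=4)
buf = pool.acquire()
buf[:5] = b"hello"  # stand-in for a Worker filling the block with received data
pool.release(buf)   # the same buffer is reused by a later acquire()
```

The point of the pattern is that repeated requests for data blocks reuse existing buffers instead of going back to the general-purpose allocator each time.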
Source
Computer Systems & Applications (《计算机系统应用》)
2011, Issue 8, pp. 104-109 (6 pages)
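Funding
Ministry of Industry and Information Technology, National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (2009X01028-002-003)
Natural Science Foundation of Anhui Province (090412068)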