Abstract
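With the development of high-resolution data acquisition technology, the volume of geographic raster data keeps growing. Serial computation cannot process large raster datasets quickly, so parallel techniques are needed to improve efficiency. In traditional development, the algorithm is interleaved with process scheduling, memory management, and data I/O, which places high demands on programmers and makes code quality hard to control. This paper proposes a parallel processing framework for large geographic raster data. Using the real and virtual read modes of its core class, the framework loads and writes large datasets quickly, step by step and block by block, and it treats all parallel task scheduling, inter-process data transfer, and specific raster algorithm steps as tasks. The framework separates the algorithm itself from low-level operations such as parallel scheduling and disk I/O, so that algorithm authors can focus on the algorithm; this lowers development difficulty, improves code quality, and makes it possible to write algorithm programs for large geographic raster data quickly. Experiments show that the framework achieves good parallel performance while significantly reducing code size and improving software quality.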
As data acquisition technology advances, the volume of geographic raster data increases continuously. A single process cannot handle large raster data efficiently, so parallel processing is necessary. Traditional development mixes the algorithm with process scheduling, memory management, and data I/O, which places high demands on programmers and makes code quality difficult to control. This study proposes a Huge Geographic Raster Data Parallel Processing Framework (HGRDPPF). Using the real-read and virtual-read methods of its core class, the framework loads and writes large raster data quickly, by steps or by blocks, and abstracts parallel task scheduling, data transfer, and specific algorithm stages into tasks. Through this framework, the raster file is split into sub-tasks according to the capability of each computer in the cluster, and the raster processing algorithm is separated from the MPI API, disk I/O, and scheduling logic, so developers can concentrate on the algorithm itself and achieve higher program quality. Experiments show that the framework significantly reduces the amount of code, improves software quality, and achieves good parallel performance.
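The separation the abstract describes can be illustrated with a minimal sketch. It is not the HGRDPPF implementation: the names `split_rows` and `process_raster` are hypothetical, an in-memory list of rows stands in for the raster file, the capability weights are assumed inputs, and a serial loop stands in for the MPI dispatch that the paper's framework performs. The point is only that the algorithm author supplies a per-block function and never touches the splitting or I/O code.

```python
from typing import Callable, List, Sequence, Tuple

Row = List[float]
Block = Tuple[int, int]  # (first_row, row_count)


def split_rows(total_rows: int, weights: Sequence[float]) -> List[Block]:
    """Split rows into contiguous blocks sized in proportion to weights.

    Each weight is an assumed measure of one worker's capability.
    """
    if total_rows < 0 or not weights or any(w <= 0 for w in weights):
        raise ValueError("need total_rows >= 0 and positive weights")
    total_weight = sum(weights)
    blocks: List[Block] = []
    start = 0
    cumulative = 0.0
    for i, w in enumerate(weights):
        cumulative += w
        if i == len(weights) - 1:
            end = total_rows  # last block absorbs any rounding remainder
        else:
            end = int(total_rows * cumulative / total_weight)
        blocks.append((start, end - start))
        start = end
    return blocks


def process_raster(
    raster: List[Row],
    weights: Sequence[float],
    algorithm: Callable[[List[Row]], List[Row]],
) -> List[Row]:
    """Apply a per-block algorithm to every block and reassemble the result.

    The slice stands in for a block read and the extend for a block write;
    a real framework would hand each block to a separate process instead.
    """
    result: List[Row] = []
    for first, count in split_rows(len(raster), weights):
        block = raster[first:first + count]
        result.extend(algorithm(block))
    return result


def double_values(block: List[Row]) -> List[Row]:
    """Example algorithm: the only code an algorithm author would write."""
    return [[value * 2 for value in row] for row in block]


if __name__ == "__main__":
    raster = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(process_raster(raster, [1, 2], double_values))
```

This sketch only covers algorithms that work cell by cell. Neighborhood operations would need overlapping rows between adjacent blocks, which it does not handle.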
Source
《国防科技大学学报》
EI
CAS
CSCD
PKU Core Journal
2013, No. 6, pp. 152-156 (5 pages)
Journal of National University of Defense Technology
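Funding
National Natural Science Foundation of China (401101384)
National High Technology Research and Development Program of China (863 Program) (2011AA120302)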