Abstract
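In the parallel processing of geographic raster data, data I/O has become one of the main bottlenecks limiting computational performance. To address this problem, this paper first analyzes the GeoTIFF format, which is widely used to store GIS raster data, focusing on its two storage modes (strip storage and tile storage), and builds, for each mode, a model that maps raster data from its logical structure to its physical storage structure. Then, to meet the needs of parallel geospatial computing, it proposes a parallel read/write framework for raster data and, using the file-view method of MPI parallel I/O, implements a parallel GeoTIFF I/O library (pGTIOL). The results show that, compared with the master-worker I/O mode of the open-source Geospatial Data Abstraction Library (GDAL), pGTIOL reads and writes data correctly and delivers higher performance. The library hides the details of the underlying parallel I/O, provides simple, easy-to-use interfaces for parallel reading and writing of GeoTIFF raster data, supports multiple data types and multiple spatial decomposition schemes, and implements asynchronous parallel reading and writing of both strip-stored and tile-stored data, thereby meeting the needs of dynamic load balancing.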
Data I/O has become one of the main bottlenecks in parallel geospatial computing. In this study, we first examine the data structure of GeoTIFF, a widely used GIS raster data format, focusing on its two storage modes (strip storage and tile storage). For both storage modes, we construct transfer functions that map the logical structure of the data to its physical storage structure. We also design a framework for parallel I/O of raster data and implement a parallel GeoTIFF I/O library (pGTIOL) using the file-view technique of MPI-IO. Experimental results show that pGTIOL improves I/O performance in comparison with the master-worker I/O mode based on the Geospatial Data Abstraction Library (GDAL). pGTIOL encapsulates the underlying parallel I/O routines and provides easy-to-use interfaces for the parallel reading and writing of GeoTIFF data. Compared with other parallel raster I/O software packages, pGTIOL supports a wide range of data types, both the strip and tile storage modes, and various domain decomposition methods. Most importantly, pGTIOL supports asynchronous parallel I/O, which allows multiple processes to read and write sub-domains of data on demand. Hence, it can facilitate dynamic load balancing in applications.
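The logical-to-physical mapping mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's transfer function: it only follows the general TIFF layout rules and assumes a single-band, uncompressed raster. The function names and parameters (strip_location, tile_location, rows_per_strip, tile_width, tile_length, bytes_per_sample) are hypothetical and chosen for illustration.

```python
def strip_location(row, col, width, rows_per_strip, bytes_per_sample):
    """Map pixel (row, col) to (strip index, byte offset inside that strip).

    Assumes a single-band, uncompressed raster stored in strips,
    where every strip holds `rows_per_strip` full image rows.
    """
    strip = row // rows_per_strip
    row_in_strip = row % rows_per_strip
    offset = (row_in_strip * width + col) * bytes_per_sample
    return strip, offset


def tile_location(row, col, width, tile_width, tile_length, bytes_per_sample):
    """Map pixel (row, col) to (tile index, byte offset inside that tile).

    Assumes a single-band, uncompressed raster stored in tiles that are
    numbered left to right, then top to bottom. Edge tiles are padded to
    the full tile size, so the offset inside a tile uses `tile_width`.
    """
    tiles_across = (width + tile_width - 1) // tile_width  # ceiling division
    tile = (row // tile_length) * tiles_across + (col // tile_width)
    offset = ((row % tile_length) * tile_width + (col % tile_width)) * bytes_per_sample
    return tile, offset


# Example: a raster 10 pixels wide with 4-byte samples.
strip_result = strip_location(5, 3, width=10, rows_per_strip=4, bytes_per_sample=4)
tile_result = tile_location(5, 3, width=10, tile_width=4, tile_length=4, bytes_per_sample=4)
print(strip_result, tile_result)
```

To obtain an absolute position in the file, the offset inside the strip or tile would be added to the start of that strip or tile as recorded in the file's StripOffsets or TileOffsets tag. A parallel reader could use such positions to describe each process's sub-domain, for example when defining an MPI-IO file view.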
Source
《地球信息科学学报》
CSCD
PKU Core Journal (北大核心)
2015, Issue 5, pp. 575-582 (8 pages)
Journal of Geo-information Science
Funding
Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education of China (20130145120013)