Massive Geo-spatial Data Cloud Storage and Services Based on NoSQL Database Technique (cited by: 61)
Abstract: In recent years, with the explosive growth of Earth Observation System (EOS) data and the rise of a new geography paradigm, efficient storage management and online services for massive geo-spatial data have become an increasingly prominent issue in geographic information science. Taking into account the different characteristics of vector and raster spatial data, this paper proposes and implements a distributed cloud storage management and access service scheme for massive spatial data that integrates vector and raster data. For massive vector data storage and processing, it introduces the distributed graph database Neo4J and a parallel graph computing framework.

Building on a three-tier spatial data cloud storage architecture, we present implementation strategies and methods for cloud storage of raster and vector data based on NoSQL database technology, followed by the design of a universal data access interface. Raster data are stored in the distributed file system HDFS, with a distributed spatial index built in the column-family database HBase. Vector data are stored in the ACID-compliant distributed graph database Neo4J, with an R-tree spatial index.

Within the framework of GeoKSCloud, the Geographical Knowledge Cloud platform developed by our research group, we have implemented a preliminary version of its core component, the spatial data aggregation centre (GeoDAC), which provides distributed spatial data storage management and access services for all types of users. A testbed was established with five physical nodes and seven virtual nodes, located in different areas and running different operating systems, and was used to compare GeoDAC with the open-source GIS software PostGIS on vector data read and write performance. The preliminary results indicate that, although GeoDAC shows no speed-up in write performance over PostGIS, its read and spatial query performance is far stronger. Because GeoDAC spatially partitions massive data and distributes the partitions across the cluster, it can process query requests in parallel, which greatly increases spatial query speed. The techniques and system developed in this work offer users a tool for further in-depth processing and have broad application prospects.
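The idea of partitioning data spatially and answering a query in parallel over the partitions can be illustrated with a minimal Python sketch. It is not GeoDAC's implementation: the abstract does not state the partitioning scheme, so a uniform grid is assumed here, threads stand in for cluster nodes, and all names (`cell_of`, `partition`, `parallel_bbox_query`) are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Assumed data model: a point feature is (feature_id, x, y);
# a bounding box is (min_x, min_y, max_x, max_y).

def cell_of(x, y, cell_size):
    """Map a coordinate to its cell in an assumed uniform grid."""
    return (int(x // cell_size), int(y // cell_size))

def partition(features, cell_size):
    """Group features by grid cell; each cell plays the role of one partition."""
    parts = defaultdict(list)
    for fid, x, y in features:
        parts[cell_of(x, y, cell_size)].append((fid, x, y))
    return parts

def cells_overlapping(bbox, cell_size):
    """List every grid cell that the query box touches."""
    min_x, min_y, max_x, max_y = bbox
    cx0, cy0 = cell_of(min_x, min_y, cell_size)
    cx1, cy1 = cell_of(max_x, max_y, cell_size)
    return [(cx, cy) for cx in range(cx0, cx1 + 1) for cy in range(cy0, cy1 + 1)]

def query_partition(part, bbox):
    """Exact filter inside one partition."""
    min_x, min_y, max_x, max_y = bbox
    return [fid for fid, x, y in part
            if min_x <= x <= max_x and min_y <= y <= max_y]

def parallel_bbox_query(parts, bbox, cell_size, workers=4):
    """Query only the partitions the box touches, one task per partition."""
    cells = [c for c in cells_overlapping(bbox, cell_size) if c in parts]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda c: query_partition(parts[c], bbox), cells))
    return sorted(fid for r in results for fid in r)

features = [(1, 0.5, 0.5), (2, 1.5, 0.5), (3, 2.5, 2.5), (4, 9.0, 9.0)]
parts = partition(features, cell_size=1.0)
hits = parallel_bbox_query(parts, (0.0, 0.0, 2.0, 1.0), cell_size=1.0)
```

Partitions that the query box does not touch are never scanned, which is one plausible reason a partitioned store can read faster than a single-node scan while gaining nothing on writes.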
Source: Journal of Geo-information Science (《地球信息科学学报》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 2, pp. 166-174 (9 pages)
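Funding: National Key Technology R&D Program of China (2013BAH28F00); Fujian Provincial Science and Technology Program (2010I0008, 2010HZ0004-1); EU Seventh Framework Programme international cooperation project (FP7-2009-People-IRSES, No. 247608)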
Keywords: geo-spatial data; vector data; cloud-enabled data storage; NoSQL; Geographical Knowledge Cloud; data aggregation centre; access service