Abstract
Cloud storage platforms built on the key-value model support only simple keyword queries and cannot answer multidimensional spatial queries. To address this problem, the paper proposes a new distributed spatial index, the M-Quadtree index. For index construction, it designs a spatial data partitioning method based on an improved quadtree. The method specifies a minimum amount of data for each leaf-node region and re-merges quadtree leaf nodes, which resolves the imbalance in storage volume among the resulting sub-regions and satisfies the parallelization requirements of MapReduce. The paper gives algorithms for fast construction, querying, and updating of the M-Quadtree index under the MapReduce framework. Experiments on a Hadoop platform examine the effect of key parameters on index efficiency and measure index creation, query, and update performance at different data scales. Comparative experiments against existing distributed spatial indexes show that the M-Quadtree index performs better in storage load balancing, algorithm parallelization, and spatial query efficiency.
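The partitioning idea in the abstract (a quadtree whose undersized leaves are merged back together) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the abstract does not give the split rule, the merge rule, or any parameter names. The names `Node`, `build`, `partitions`, `max_points`, and `min_points` are hypothetical, and merging is restricted to sibling leaves purely to keep the example short.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # Axis-aligned region [x0, x1) x [y0, y1); hypothetical structure.
    x0: float
    y0: float
    x1: float
    y1: float
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)

def build(node, max_points, max_depth=16):
    """Standard quadtree split: divide a node while it holds too many points."""
    if len(node.points) <= max_points or max_depth == 0:
        return
    mx, my = (node.x0 + node.x1) / 2, (node.y0 + node.y1) / 2
    quads = [Node(node.x0, node.y0, mx, my), Node(mx, node.y0, node.x1, my),
             Node(node.x0, my, mx, node.y1), Node(mx, my, node.x1, node.y1)]
    for x, y in node.points:
        quads[(1 if x >= mx else 0) + (2 if y >= my else 0)].points.append((x, y))
    node.points, node.children = [], quads
    for q in quads:
        build(q, max_points, max_depth - 1)

def partitions(node, min_points):
    """Return partitions, each a list of leaves. Sibling leaves holding fewer
    than min_points are merged greedily (an assumed rule, not the paper's)."""
    if not node.children:
        return [[node]]
    result, small = [], []
    for child in node.children:
        if child.children:
            result.extend(partitions(child, min_points))
        elif len(child.points) >= min_points:
            result.append([child])
        else:
            small.append(child)
    merged, group, count = [], [], 0
    for leaf in small:
        group.append(leaf)
        count += len(leaf.points)
        if count >= min_points:
            merged.append(group)
            group, count = [], 0
    if group:
        # Leftover leaves join the last merged group; if none exists they
        # form their own partition, which may stay below min_points.
        if merged:
            merged[-1].extend(group)
        else:
            merged.append(group)
    return result + merged

root = Node(0, 0, 4, 4, points=[(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (1.5, 1.5),
                                (0.2, 0.2), (3, 1), (1, 3), (3, 3)])
build(root, max_points=4)
parts = partitions(root, min_points=2)
sizes = [sum(len(leaf.points) for leaf in p) for p in parts]
```

In this toy input the plain quadtree yields seven leaves, most holding a single point, while the merged partitions hold 2, 3, and 3 points. Each partition could then serve as one unit of work for a parallel job, which is the kind of balance the abstract attributes to the leaf re-merging step.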
Source
《测绘学报》
EI
CSCD
PKU Core (Peking University Core Journals)
2016, No. 11, pp. 1342-1351 (10 pages)
Acta Geodaetica et Cartographica Sinica