Abstract
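Traditional relational spatial databases can no longer cope well with very large-scale, highly concurrent spatial query workloads. To address the increasingly demanding large-scale spatial query requirements that geographic information services face in the big data era, this paper explores a set of Spark-based techniques for implementing spatial queries and presents a corresponding solution. It proposes GeoSpark SQL, a Spark-based spatial query model that provides an SQL-like access interface, and resolves the following key problems: generating bounding rectangle data for the input data, and importing and exporting standard geographic information data to and from Spark; implementing spatial query operators in Spark; and spatial indexing and query optimization in Spark. In preliminary experiments, GeoSpark SQL already meets real-time requirements and also performs well on complex spatial queries.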
Because traditional relational spatial databases can no longer meet the requirements of large-scale, highly concurrent access, this paper aims to address the growing demand for large-scale spatial queries in the era of big data, and explores a set of Spark-based methods and solutions for spatial query. An implementation model, GeoSpark SQL, which is built on Spark and provides an SQL interface for spatial queries, is proposed. The following key issues are studied and solved: generating bounding-box columns and importing and exporting standard geographic spatial data; extending Spark with spatial relationship operators for use in spatial queries; and acceleration methods for spatial query parameters together with a local cache for geometry deserialization. In preliminary experiments, GeoSpark SQL meets real-time requirements and performs well on complex spatial joins.
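The abstract does not say how the bounding-box columns are used. A common pattern, assumed here rather than taken from the paper, is to store each geometry's bounding box alongside it and use a cheap box-overlap test to discard rows before any exact spatial predicate runs. The sketch below illustrates that idea in plain Python without Spark; the function names, the row layout, and the sample features are all hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def bounding_box(ring: List[Point]) -> BBox:
    """Compute the axis-aligned bounding box of a list of vertices."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def boxes_intersect(a: BBox, b: BBox) -> bool:
    """True when two boxes overlap or touch on both axes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def prefilter(rows: List[Dict], query_box: BBox) -> List[Dict]:
    """Keep only rows whose stored bounding box overlaps the query box.

    Survivors are candidates only; an exact geometry test would still
    be needed afterwards (not shown here).
    """
    return [row for row in rows if boxes_intersect(row["bbox"], query_box)]


# Hypothetical sample features: id -> polygon vertices.
features = {
    "a": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "b": [(5, 5), (6, 5), (6, 7)],
    "c": [(1.5, 1.5), (3, 1.5), (3, 3)],
}

# Precompute the bounding box once per feature, as an extra "column".
rows = [
    {"id": fid, "ring": ring, "bbox": bounding_box(ring)}
    for fid, ring in features.items()
]

candidates = prefilter(rows, (1.0, 1.0, 2.5, 2.5))
candidate_ids = [row["id"] for row in candidates]
```

Because the box test is four numeric comparisons, it can run as an ordinary column filter, which is one plausible reason to materialize the boxes as columns in the first place.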
Source
Science of Surveying and Mapping (《测绘科学》)
CSCD
PKU Core Journal (北大核心)
2016, No. 12, pp. 273-278 (6 pages)