Abstract
To address the Spark engine's lack of support for multi-dimensional spatial queries, a two-level spatial index based on the R-tree is proposed: an R-subtree is built on each Worker node, and these subtrees serve as the children of an R-tree built on the Master node. Because the LRU algorithm replaces memory at a coarse granularity and its results are not precise enough, a memory replacement method based on data usage weight is proposed. The method takes the ratio of the amount of data actually used in each access to the total amount of that data as the replacement weight, and persists hot scene data in memory as RDDs, which improves the efficiency of in-memory queries. Following the visual principle that distant objects need only coarse detail while nearby objects need fine detail, a level-of-detail query is proposed: the point cloud data that best represent an object's features are transmitted to the client first, or only the points of a simplified model are transmitted, so as to mitigate insufficient network bandwidth and data loading delay. Experiments show that the proposed methods effectively solve the multi-dimensional spatial query problem on Spark and clearly improve query efficiency.
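The usage-weight replacement idea can be illustrated with a small sketch. This is not the authors' implementation: the abstract only states that the weight is the ratio of data actually used to its total amount. The class names `Block` and `UsageWeightCache`, the running-mean aggregation over repeated accesses, and the "evict the lowest weight first" policy are assumptions made here for illustration, and plain Python dictionaries stand in for persisted RDDs.

```python
from dataclasses import dataclass


@dataclass
class Block:
    size: int            # total amount of data held by this block
    weight: float = 0.0  # running mean of used/total over all accesses (assumption)
    hits: int = 0        # number of accesses recorded so far


class UsageWeightCache:
    """Toy cache that evicts the block with the lowest usage weight."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = {}

    def used_space(self):
        return sum(b.size for b in self.blocks.values())

    def access(self, key, used, total):
        """Record an access touching `used` of `total` units; return evicted keys."""
        if total <= 0 or not 0 <= used <= total or total > self.capacity:
            raise ValueError("invalid access sizes")
        block = self.blocks.setdefault(key, Block(size=total))
        block.hits += 1
        # Incremental mean of the per-access ratio used/total.
        block.weight += (used / total - block.weight) / block.hits
        evicted = []
        while self.used_space() > self.capacity:
            # Never evict the block that was just accessed.
            victim = min((k for k in self.blocks if k != key),
                         key=lambda k: self.blocks[k].weight)
            del self.blocks[victim]
            evicted.append(victim)
        return evicted


cache = UsageWeightCache(capacity=100)
cache.access("scene_a", used=10, total=50)   # weight 0.2
cache.access("scene_b", used=40, total=50)   # weight 0.8
evicted = cache.access("scene_c", used=15, total=30)  # cache is over capacity
```

In this sketch `scene_a` is evicted because only a small fraction of it was used, whereas plain LRU would decide by recency of access alone.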
Authors
赵尔平
孟小峰
ZHAO Er-ping; MENG Xiao-feng (School of Information Engineering, Xizang Minzu University, Xianyang, Shaanxi 712082, China; School of Information, Renmin University of China, Beijing 100872, China)
Source
《计算机科学》
CSCD
Peking University Core Journal (北大核心)
2018, Issue 9, pp. 213-219 (7 pages)
Computer Science
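Funding
Supported by the National Natural Science Foundation of China (61762082)
and the Natural Science Foundation of Tibet Autonomous Region (12KJZRYMY07)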