Abstract
HBase builds its index only on the row key, so queries on non-row-key columns perform poorly. To address this, a coprocessor-based classified secondary index scheme for HBase is proposed. The scheme comprises a coprocessor-based index management and parallel query mechanism: an Observer builds and maintains the index in memory, and a parallel query algorithm built on Endpoint improves query performance on non-row-key columns. Because data characteristics and query requirements determine which type of index should be built, a classified in-memory index model is further designed to balance query performance and index performance. Experiments on a taxi GPS dataset show that the scheme achieves better overall performance than secondary index schemes based on Solr and HiBase.
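The two mechanisms named in the abstract can be illustrated with a toy simulation. This is a minimal sketch, not the authors' implementation: real HBase coprocessors are Java classes that run inside RegionServers, whereas everything here (`Region`, `put`, `_post_put`, `query`, `parallel_query`) is a hypothetical name in plain Python. It assumes a single hash-style index per region and does not model the classified index types.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class Region:
    """Toy stand-in for one HBase region: rows plus an in-memory index."""

    def __init__(self, indexed_columns):
        self.rows = {}                              # rowkey -> {column: value}
        self.indexed_columns = set(indexed_columns)
        self.index = defaultdict(set)               # (column, value) -> {rowkey}

    def put(self, rowkey, columns):
        old = self.rows.get(rowkey, {})
        self.rows[rowkey] = {**old, **columns}
        self._post_put(rowkey, old, columns)

    def _post_put(self, rowkey, old, new):
        # Observer-like hook (assumption): update the index on every write.
        for col, value in new.items():
            if col not in self.indexed_columns:
                continue
            if col in old:
                # Drop the stale entry when an indexed value is overwritten.
                self.index[(col, old[col])].discard(rowkey)
            self.index[(col, value)].add(rowkey)

    def query(self, column, value):
        # Endpoint-like call (assumption): answer from the local index only.
        return sorted(self.index.get((column, value), ()))


def parallel_query(regions, column, value):
    # Fan the lookup out to every region at once, then merge the partial results.
    with ThreadPoolExecutor(max_workers=len(regions)) as pool:
        parts = list(pool.map(lambda r: r.query(column, value), regions))
    return sorted(key for part in parts for key in part)


regions = [Region(["taxi_id"]), Region(["taxi_id"])]
regions[0].put("r1", {"taxi_id": "A", "speed": 40})
regions[1].put("r2", {"taxi_id": "A", "speed": 55})
regions[1].put("r3", {"taxi_id": "B", "speed": 30})
regions[0].put("r1", {"taxi_id": "B"})              # overwrite an indexed column

hits_a = parallel_query(regions, "taxi_id", "A")
hits_b = parallel_query(regions, "taxi_id", "B")
```

Keeping the index next to the data it covers means each region answers from local memory, so the client only merges small lists of row keys instead of scanning the table.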
Authors
CHEN Shunju; ZOU Zhe; LIU Rui; TAO Tao; WANG Chao; ZHENG Linjiang
Chongqing Public Security Bureau Yubei Branch Traffic Patrol Police Detachment, Chongqing 401120, China; College of Computer Science, Chongqing University, Chongqing 400044, China
Source
Journal of Chongqing University of Technology (Natural Science)
CAS
PKU Core
2021, No. 4, pp. 142-151, 200 (11 pages)
Funding
National Key Research Program project (2017YFC0805200)
National Key Research Program project (2016YFC0801707)