Abstract
Many NoSQL databases have been developed to store and query massive data efficiently, and HBase is one of them. However, native HBase supports only a primary key index when querying data; non-primary-key data can be queried only through a full table scan, which greatly reduces the speed of multi-condition queries. To address this, an in-memory index construction scheme for HBase based on coprocessors is proposed. Coprocessors are used to build secondary indexes quickly and to update them automatically as the HBase table changes. The index is also persisted and, when used, is processed through in-memory computation, which greatly improves the retrieval speed of index data and ensures the availability and fault tolerance of the index. Experimental results show that the scheme greatly improves conditional retrieval speed compared with native HBase, and that it also retrieves faster than secondary index schemes based on Solr and HiBase.
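The idea of a secondary index that is kept in memory and refreshed on every table change can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the HBase coprocessor API: the `IndexedTable` class and all of its method names are hypothetical, a plain dictionary stands in for the HBase table, the write hooks are ordinary method calls, and persistence of the index is not modeled.

```python
from collections import defaultdict


class IndexedTable:
    """Toy key-value table with an in-memory secondary index on one column.

    Hypothetical stand-in for an HBase table plus a write hook; it only
    illustrates the index-maintenance idea, not any real HBase API.
    """

    def __init__(self, indexed_column):
        self.indexed_column = indexed_column
        self.rows = {}                  # row key -> {column: value}
        self.index = defaultdict(set)   # column value -> set of row keys

    def put(self, row_key, columns):
        # Drop the stale index entry, store the row, then re-index it.
        # Simplification: a put replaces the whole row instead of merging.
        self._unindex(row_key)
        self.rows[row_key] = dict(columns)
        value = columns.get(self.indexed_column)
        if value is not None:
            self.index[value].add(row_key)

    def delete(self, row_key):
        # Keep the index consistent when a row is removed.
        self._unindex(row_key)
        self.rows.pop(row_key, None)

    def _unindex(self, row_key):
        old = self.rows.get(row_key)
        if old is None:
            return
        value = old.get(self.indexed_column)
        keys = self.index.get(value)
        if keys:
            keys.discard(row_key)
            if not keys:
                del self.index[value]

    def query_by_scan(self, value):
        # Baseline: examine every row, as a full table scan would.
        return sorted(k for k, cols in self.rows.items()
                      if cols.get(self.indexed_column) == value)

    def query_by_index(self, value):
        # Indexed lookup: one dictionary access instead of a scan.
        return sorted(self.index.get(value, ()))
```

In this sketch both query methods return the same row keys; the indexed lookup simply avoids touching rows that do not match, which is the effect the abstract attributes to the secondary index.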
Authors
ZHU Songjie (朱松杰)
LOU Yuansheng (娄渊胜)
YE Feng (叶枫)
LI Ling (李凌)
CHEN Yong (陈勇)
Affiliations: Department of Computer and Information, Hohai University, Nanjing 211100, China; Postdoctoral Centre, Nanjing Longyuan Micro-Electronic Company, Nanjing 211106, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2020, Issue 1, pp. 98-105 (8 pages)
Funding
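2017 Jiangsu Province Postdoctoral Research Funding Program (No.1701020C)
2017 Jiangsu Province "Six Talent Peaks" Project (No.XYDXX-078)
Fundamental Research Funds for the Central Universities (No.2013B01814)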