Abstract
HBase (Hadoop Database) supports only a primary index structure: data can be retrieved only by primary key or by a primary-key range. To support retrieval of non-primary-key data, this paper proposes building an HBase secondary index with a new variant of the Counting Bloom Filter (CBF). After analyzing existing CBF techniques, and to address the high overflow probability of CBF, it proposes the Split Counting Bloom Filter (SCBF). SCBF divides a standard CBF into multiple mutually independent regions, which jointly store each element's fingerprint. Experimental results show that, compared with the standard CBF, SCBF lowers the overflow probability, improves filter performance, and is well suited to building an HBase secondary index.
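For readers unfamiliar with the overflow problem mentioned above, the following is a minimal sketch of a standard counting Bloom filter with fixed-width counters. It is not the paper's SCBF: the split-region and fingerprint scheme is not reproduced here, and the class name, parameters, and the SHA-256-based hashing are all assumptions chosen for illustration.

```python
import hashlib


class CountingBloomFilter:
    """Illustrative standard CBF with fixed-width, saturating counters."""

    def __init__(self, num_counters=64, num_hashes=3, counter_bits=4):
        self.m = num_counters
        self.k = num_hashes
        self.max_count = (1 << counter_bits) - 1  # e.g. 15 for 4-bit counters
        self.counters = [0] * num_counters
        self.overflows = 0  # increments that hit an already-full counter

    def _positions(self, item):
        # Assumption: derive k positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            if self.counters[pos] == self.max_count:
                self.overflows += 1  # counter is full; the increment is lost
            else:
                self.counters[pos] += 1

    def remove(self, item):
        # Only meaningful for items that were added earlier.
        # Saturated counters are left untouched to avoid false negatives.
        for pos in self._positions(item):
            if 0 < self.counters[pos] < self.max_count:
                self.counters[pos] -= 1

    def might_contain(self, item):
        return all(self.counters[pos] > 0 for pos in self._positions(item))
```

With narrow counters and many elements mapped to the same slot, `overflows` grows and deletions become unreliable, which is the weakness the abstract says SCBF is designed to reduce.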
Source
《计算机系统应用》
2016, Issue 3, pp. 119-123 (5 pages)
Computer Systems & Applications