期刊文献+

适用于范围查询的列存储数据桶划分算法 被引量:3

A Column-Store Based Bucket Partition Algorithm for Range Queries
下载PDF
导出
摘要 范围查询是数据库中一项重要的操作.列存储数据库中,能否有效查找一个范围内的属性值,获取对应的行号集合,将极大影响元组重构的效率.与树型结构相比,Hash表对数据的精确查找具有更高的效率,但是范围查找的效率比较低.针对这种情况,提出了一种改进的可用于范围查询的数据桶划分算法.为了能够更好地对算法进行描述,首先提出了可用于范围查询的Hash存储模型(rangedHash,RH),并给出了桶的值域和序列化的定义.其次针对列存储等"读优先"特性,在RH模型的基础上,提出一种改进的桶划分算法.该算法生成可序列化的哈希函数把属性值划分到桶中,能够同时提高属性值的范围查询效率和存储效率.最后,通过实验结果验证算法的有效性. Range query is significant to databases. In a column-store database, using range queries on attribute values to obtain the resulting row-id set, would affect the performance of tuple reconstruction. Compared with tree structure, Hash tables are more effective in exact queries but less effective in range queries. With this situation, a bucket partition algorithm for range queries is proposed. Firstly, In order to give a good introduction to the algorithm, a Hash storage model used for range queries (ranged hash, RH) is proposed, along with the definition of the bucket range and the serialization. Then, according to the "read-optimized" feature of column store databases, an improved bucket partition algorithm used for range queries is proposed based on the RH model. The algorithm could generate serializable Hash functions to partition attribute values into buckets, and could improve not only the efficiency of range queries but also the storage efficiency. Finally, the experimental results prove the efficiency of the algorithm.
出处 《计算机研究与发展》 EI CSCD 北大核心 2013年第3期594-601,共8页 Journal of Computer Research and Development
基金 "核高基"重大科技专项基金项目(2010ZX01042-001-003-004) 国家自然科学基金项目(61070031 61070032) 上海市自然科学基金项目(11ZR1401200)
关键词 列存储 范围查询 HASH表 可序列化 桶划分 column-store range query Hash table serializable bucket partition
  • 相关文献

参考文献22

  • 1Abadl D J, Bonez P A, Harizopoulos S. Column oriented database systems [C]//Proc of the 35th VLDB Conf. Trondheim, Norway: VLDB Endowment, 2009:1664-1665.
  • 2Zukowski M, Boncz P A, Nes N, et al. Monet DB/X100- A DBMS in the CPU cache [J]. IEEE Data Engineering Bulletin, 2005, 28(2): 17-22.
  • 3Stonebraker M, Abadi D J, Batkin A, et al. C-Store: A column-oriented DBMS [C]//Proc of the 31st VLDB Conf. Trondheirn, Norway: VLDB Endowment, 2005:553-564.
  • 4MacNicol R, French B. Sybase IQ Multiplex-Designed For Analytics [C] //Proc of the 30th VLDB Conf. Trondheim, Norway: VLDB Endowment, 2004 :1227-1230.
  • 5Abadi D J, Madden S, Hachem N. Column-stores vs. row-stores: How different are they really? [C]//Proc of the 2008 ACM SIGMOD Int Conf. New York: ACM, 2008:967-980.
  • 6Garcia-Molina H, Ullman J D, Widom J. Database System Implementation [M]. Upper Saddle River, NJ, Prentice- Hall, 2000.
  • 7Chan C Y, Ioannidis Y E. Bitmap index design and evaluation [C] //Proc of the 2008 ACM SIGMOD Int Conf. New York: ACM, 1998:355-366.
  • 8Olson M A, Bostie K, Seltzer M I. Berkeley DB [C] //Proc of the 10th USENIX Annual Technical Conf. Berkeley, CA USENIX, 1999: 183-191.
  • 9Maurer W D, Lewis T G. Hash table methods [J]. ACM Computing Surveys(CSUR), 1975, 7(1) : 5-19.
  • 10Litwin W. Linear hashing: A new tool for file and table addressing [C]//Proc of the 6th VLDB Conf. Los Alamitos, CA: IEEE Computer Society, 1980:212-223.

二级参考文献2

共引文献1

同被引文献18

引证文献3

二级引证文献3

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部