期刊文献+

列存储数据仓库查询执行中重用缓冲区调度算法 被引量:6

Scheduling Algorithm for the Reuse Buffers in Column-Store Data Warehouse Query Execution
下载PDF
导出
摘要 查询的中间结果重用是提高查询效率的重要手段.现有列存储系统主要关注多查询计划间的中间结果重用,忽略了单一查询计划执行过程中大量可重复访问的中间结果.单一查询中的中间结果具有确定性高、结果大小可估计的特征,非常适合作为重用的对象.为此,针对列存储数据仓库单一查询计划执行过程中的中间结果重用问题,提出了一个重用缓冲区空间的调度算法.首先,基于操作结点在给定物理执行计划树中的相对位置及其操作所产生的中间结果的大小对操作结点提出重用度估计模型.其次,设计了基于模型估计结果的缓冲区调度算法.在每一个查询计划的执行过程中,根据其模型估计结果执行缓冲区调度算法,使得其产生的中间结果中更重要的部分能够更久地驻留在内存中,以提升查询性能.在数据仓库基准数据集SSB上的实验结果验证了方法的有效性. Reusing intermediates is an important way to improve the performance of query execution. The current column-store systems mainly focus on the reusage of the intermediates in multiple query plans, while large quantities of reusable intermediates in a single query are neglected. The intermediates of a single query are suitable for reusing during the process of execution due to the characteristics of their high certainty and the evaluable amount. To deal with this problem, a novel scheduling algorithm for the reuse buffers is proposed. Firstly, we propose a reusability estimation model based on the relative position of the given operator node in the physical execution tree as well as the estimated volume of the intermediates it produces during execution. Then, we provide the reuse buffer scheduling algorithm based on the results of the reusability estimation model. In the process of query execution for each query plan, the scheduling algorithm is executed on the basis of the results of its reusability estimation model, making the more important intermediates stay longer in the memory than the others, leading the improvement of query performance. The experimental results on the benchmark data set SSB verify the effectiveness of the proposed method.
出处 《计算机研究与发展》 EI CSCD 北大核心 2011年第10期1942-1950,共9页 Journal of Computer Research and Development
基金 "核高基"国家科技重大专项基金项目(2010ZX01042-001-003-004) 国家自然科学基金项目(61070031 61070032) 上海市自然科学基金项目(11ZR1401200)
关键词 列存储 查询执行 中间结果重用 重用缓冲区 数据仓库 column-store query execution reuse intermediate reuse buffer data warehouse
  • 相关文献

参考文献13

  • 1Inmon W H. Building the Data Warehouse [M]. New York: Wiley, 2005.
  • 2Harizopoulos S, Liang V, Abadi D J, et ai. Performance tradeoffs in read-optimized databases [C] //Proc of the 32nd VLDB Conf. Trondheim, Norway: VLDB Endowment, 2006: 487-498.
  • 3Carcia-Molina H, Ullrnan J D, Widom J. Database System Implementation [M]. Englewood Cliffs, NJ: Prentice Hall, 2000.
  • 4Ailamaki A, DeWitt D J, Hill M D, et al. Weaving relations for cache performance [C] //Proc of the 27th VLDB Conf. San Francisco: Morgan Kaufmann, 2001:169-180.
  • 5Boβwetter D. SPAX-PAX with super-pages [C] //Proc of the 13th East European Conf on Advances in Databases and Information Systems. Berlin: Springer, 2009:362-377.
  • 6Abadi D J, Madden S R, Haehem N. Column-stores vs. row stores: How different are they really? [C] //Proc of the 2008 ACM SIGMOD Int Conf. New York: ACM, 2008: 967-980.
  • 7Ivanova M G, Kersten M L, Nes N J, et al. An architecture for recycling intermediates in a column-store [C] //Proc of ACMSIGMOD'09. NewYork:ACM, 2009:309-320.
  • 8Boncz P, Zukowski M, Nes N. MonetDB/X100: Hyper- pipelining query execution [C] //Proc of the 2005 CIDR Conf. New York: ACM, 2005:225-237.
  • 9Boncz P A. Monet: A next-generation DBMS kernel for query intensive applications[D]. Amsterdam: Universiteit van Amsterdam, 2002.
  • 10Trondheim, Norway, Mike Stonebraker, Daniel J Abadi, et al. C-Store: A column oriented DBMS [C] //Proc of the 31st VLDB Conf. New York: ACM, 2005:553-564.

二级参考文献10

  • 1Mattson R L,,Gecsei J,Slutz D R,et al.Evaluation techniques for storage hierarchies[].IBM Systems Journal.1970
  • 2Corbato F J.A paging experiment with the multics system. MIT Project MAC Report MAC- M- 384 . 1968
  • 3Nicola V F,Dan A,Dias D M.Analysis of the general- ized clock buffer replacement scheme for database trans- action processing[].Proceedings of ACM SIGMET- RICS Conference on Measuring and Modeling of Computer Systems.1992
  • 4Robinson J T,Devarakonda M V.Data cache manage- ment using frequency- based replacement[].Proceedings of ACM SIGMETRICS Conference on Measuring and Modeling of Computer Systems.1990
  • 5O′Neil E J,O′Neil P E,Weikum G.The LRU- K page replacement algorithm for database disk buffering[].Pro- ceedings of ACM SIGMOD Conference.1993
  • 6Johnson T,Shasha D.2Q:a low overhead high perfor- mance buffer management replacement algorithm[].Pro- ceedings of VLDB Conference.1994
  • 7Lee D,Choi J,Kim J H,et al.LRFU:a spectrum of policies that subsumes the least recently used and leastfrequently used policies[].IEEE Transactions on Comput- ers.2001
  • 8Megiddo N,Modha D S.ARC:a self- tuning,low over- head replacement cache[].Proceedings of the nd USENIX Conference on File and Storage Technologies.2003
  • 9Jiang S,Zhang X.LIRS:an efficient low inter- reference recency set replacement policy to improve buffer cache performance[].Proceedings of ACM SIGMETRICS Conference on Measuring and Modeling of Computer Sys- tems.2002
  • 10Bansal S,Modha D.CAR:clock with adaptive replace- ment[].Proceedings of the rd USENIX Symposium on File and Storage Technologies.2004

共引文献2

同被引文献54

引证文献6

二级引证文献16

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部