Abstract
Existing algorithms for selecting which data cubes to materialize are generally guided by the query frequency distribution and a materialization cost model, and overlook the fact that query results are ultimately used for decision support. In practical systems, a subset of the cube query results is often sufficient for decision support, so part of the result set is redundant for decision making. On this basis, the full cube is reduced to a Decision Cube (D-Cube). This paper proposes a decision-support-oriented scheme for selecting materialized cubes: the decision support requirements of the system are decomposed into sets of query sequences, and a decision contribution degree is defined for each query. Cubes are then selected for materialization according to query decision contribution degree, materialization cost, and query frequency. Theoretical analysis and experimental results show that the scheme outperforms other selection schemes on very large data warehouses and stream cubes.
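As a rough illustration of how the three selection criteria named in the abstract could be combined, the sketch below ranks candidate cuboids greedily under a cost budget. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the `Cuboid` fields, the score `frequency * contribution / cost`, the budget constraint, and the sample numbers are all hypothetical and should not be read as the paper's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cuboid:
    name: str
    frequency: float     # how often queries hit this cuboid
    contribution: float  # hypothetical decision contribution degree in [0, 1]
    cost: float          # materialization cost (must be > 0)


def select_cuboids(candidates, budget):
    """Greedy selection under a cost budget.

    Assumed scoring rule (not taken from the paper):
    score = frequency * contribution / cost.
    Cuboids with zero contribution are treated as decision-redundant
    and are never materialized.
    """
    ranked = sorted(
        candidates,
        key=lambda c: c.frequency * c.contribution / c.cost,
        reverse=True,
    )
    chosen = []
    remaining = budget
    for c in ranked:
        if c.contribution > 0 and c.cost <= remaining:
            chosen.append(c)
            remaining -= c.cost
    return chosen


# Hypothetical candidates; the numbers are made up for illustration.
candidates = [
    Cuboid("region_month", frequency=120, contribution=0.9, cost=40),
    Cuboid("region_day", frequency=300, contribution=0.1, cost=90),
    Cuboid("product_month", frequency=80, contribution=0.7, cost=30),
    Cuboid("product_day", frequency=50, contribution=0.0, cost=60),
]
selected = [c.name for c in select_cuboids(candidates, budget=100)]
```

In this toy setup, a frequently queried cuboid such as `region_day` can still be skipped when its results contribute little to decisions, which is the intuition behind reducing the full cube to a D-Cube.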
Source
Journal of Wuhan University of Technology (《武汉理工大学学报》)
Indexed in: CAS, CSCD, PKU Core (Peking University Core Journals)
2010, No. 20, pp. 16-21 (6 pages)
Funding
National High Technology Research and Development Program of China (863 Program) (2007AA01Z474, 2007AA010502)