Abstract: This paper proposes a semi-greedy framework for optimizing multi-join queries in shared-nothing systems. The plan generated by the framework comprises several pipelines, each performing several joins. The framework determines the 'optimal' number of joins to be performed in each pipeline. These decisions are based on a cost estimate of the entire processing plan. Two existing optimization algorithms are extended under the framework. An analytical model is presented and used to compare the quality of the plans produced by each optimization algorithm. Our study shows that the new algorithms outperform their unextended counterparts.
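The idea of fixing the number of joins per pipeline by looking at the cost of the whole plan can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy cost model (`toy_plan_cost`, with a fixed startup cost per pipeline and a penalty once a pipeline holds more joins than a hypothetical memory budget allows), the function names, and the rule of tentatively placing all remaining joins in one pipeline while a candidate size is being evaluated.

```python
from typing import Callable, List


def toy_plan_cost(pipeline_sizes: List[int],
                  startup: float = 5.0,
                  mem_penalty: float = 10.0,
                  max_joins_in_memory: int = 3) -> float:
    """Hypothetical cost of a whole plan (not the paper's analytical model).

    Each pipeline pays a fixed startup cost plus one unit of work per join,
    and an extra penalty for every join beyond the memory budget.
    """
    cost = 0.0
    for k in pipeline_sizes:
        cost += startup + k
        cost += mem_penalty * max(0, k - max_joins_in_memory)
    return cost


def choose_pipeline_sizes(n_joins: int,
                          cost_fn: Callable[[List[int]], float]) -> List[int]:
    """Pick the number of joins for one pipeline at a time.

    Each candidate size k is scored by the estimated cost of the entire
    plan: the pipelines already fixed, the candidate pipeline, and the
    remaining joins tentatively placed in a single pipeline.
    """
    sizes: List[int] = []
    remaining = n_joins
    while remaining > 0:
        def whole_plan_cost(k: int) -> float:
            rest = [remaining - k] if remaining - k > 0 else []
            return cost_fn(sizes + [k] + rest)

        best_k = min(range(1, remaining + 1), key=whole_plan_cost)
        sizes.append(best_k)
        remaining -= best_k
    return sizes


if __name__ == "__main__":
    sizes = choose_pipeline_sizes(7, toy_plan_cost)
    print(sizes, toy_plan_cost(sizes))
```

Under this toy model the chosen segmentation is cheaper than both extremes (one pipeline holding all joins, or one join per pipeline), which is the trade-off the framework is described as balancing. A real optimizer would replace `toy_plan_cost` with an actual cost model for the target system.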
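Abstract: Online analytical processing (OLAP) queries are complex queries that, when expressed in SQL (Structured Query Language), typically contain multi-table joins and group-by aggregation operations. Improving the performance of multi-table joins and group-by aggregation is therefore the key problem in ROLAP (relational OLAP) query processing. This paper proposes MuGA (group number based aggregation with multi-table join), an aggregation algorithm based on group numbers. The method takes full account of the characteristics of the data warehouse star schema and combines aggregation with a new multi-table join algorithm, MJoin (multi-table join). It uses group numbers to perform grouping and aggregation in place of the usual sorting or hashing, which effectively reduces CPU and disk access costs. Experimental results show that MuGA performs significantly better than both the traditional aggregation query processing of relational databases and an improved sort-based aggregation algorithm.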
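The general idea of aggregating by group number rather than by sorting or hashing the fact rows can be sketched as follows. This is a minimal illustration under stated assumptions, not the MuGA or MJoin algorithm itself: the function names are hypothetical, dimension keys are assumed to be dense surrogate keys starting at 0, and each dimension contributes exactly one grouping attribute. A small dictionary is used once per dimension to assign group numbers, while the scan of the fact table uses only array indexing.

```python
from typing import Dict, List, Tuple


def assign_group_numbers(dim_values: List[str]) -> Tuple[List[int], List[str]]:
    """Give each dimension row a dense group number for its grouping value.

    dim_values[key] is the grouping attribute of the row with surrogate
    key `key`. Returns (numbers, labels) where numbers[key] is the group
    number and labels[number] is the attribute value.
    """
    seen: Dict[str, int] = {}
    labels: List[str] = []
    numbers: List[int] = []
    for value in dim_values:
        if value not in seen:
            seen[value] = len(labels)
            labels.append(value)
        numbers.append(seen[value])
    return numbers, labels


def aggregate_by_group_number(
    fact: List[Tuple[Tuple[int, ...], float]],
    dims: List[List[str]],
) -> Dict[Tuple[str, ...], float]:
    """SUM the measure per combination of dimension grouping values.

    Each fact row carries one foreign key per dimension. The per-dimension
    group numbers are combined into a single array index (mixed radix),
    so the fact rows are neither sorted nor hashed.
    """
    encoded = [assign_group_numbers(d) for d in dims]
    sizes = [len(labels) for _, labels in encoded]
    total = 1
    for size in sizes:
        total *= size

    sums = [0.0] * total
    touched = [False] * total
    for keys, measure in fact:
        g = 0
        for (numbers, _), size, key in zip(encoded, sizes, keys):
            g = g * size + numbers[key]
        sums[g] += measure
        touched[g] = True

    # Decode each non-empty slot back into its tuple of grouping values.
    result: Dict[Tuple[str, ...], float] = {}
    for g in range(total):
        if not touched[g]:
            continue
        rem, parts = g, []
        for (_, labels), size in zip(reversed(encoded), reversed(sizes)):
            parts.append(labels[rem % size])
            rem //= size
        result[tuple(reversed(parts))] = sums[g]
    return result


if __name__ == "__main__":
    region = ["EU", "EU", "US"]   # customer dimension, keys 0..2
    year = ["2020", "2021"]       # date dimension, keys 0..1
    fact = [((0, 0), 10.0), ((1, 0), 5.0), ((2, 1), 7.0), ((0, 1), 1.0)]
    print(aggregate_by_group_number(fact, [region, year]))
```

The result array has one slot per combination of group numbers, so this sketch suits star-schema queries where the number of groups is small relative to the fact table.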