Responding to complex analytical queries in a data warehouse (DW) is a challenging task that demands prompt attention. The materialized view (MV) selection problem consists of selecting the views that can answer the most queries simultaneously. This work introduces a combined approach in which constraint handling is paired with metaheuristics to select an optimal subset of DW views. The proposed work first refines the solution to enable a feasible selection of views using the ensemble constraint handling technique (ECHT). The self-adaptive penalty, epsilon (ε)-parameter and stochastic ranking (SR) techniques are considered for constraint handling. These techniques helped the proposed model select the views that minimize the objective function. Further, a novel combination of the Ebola and coot optimization algorithms, named hybrid Ebola with coot optimization (CHECO), is introduced to choose the optimal MVs. Ebola and coot are recently introduced metaheuristics that search for the globally optimal set of views in the given population. By combining these two algorithms, the proposed framework produced a highly optimized set of views with minimized costs. Several cost functions are described to enable the algorithm to choose the best solution from the problem space. Finally, extensive evaluations are conducted to compare the performance of the proposed approach with existing algorithms. The proposed framework achieved a view maintenance cost of 6,329,354,613,784, a query processing cost of 3,522,857,483,566 and an execution time of 226 s when analyzed using the TPC-H benchmark dataset.
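The abstract above names the ε-parameter technique as one of the constraint handlers but gives no detail. As a rough illustration only, the following sketch shows the commonly used ε-level comparison for minimization problems: two candidates are compared by objective value when both violate the constraints by at most ε (or by the same amount), and by constraint violation otherwise. The function name and the `(objective, violation)` tuple layout are assumptions made for this example, not the paper's implementation.

```python
def epsilon_better(a, b, eps):
    """Return True if candidate a is preferred to candidate b.

    Hypothetical layout: each candidate is (objective, violation), where
    objective is minimized and violation >= 0 is the total constraint
    violation (0 means feasible).
    """
    f_a, v_a = a
    f_b, v_b = b
    # Both within the tolerated violation, or equally violating:
    # fall back to comparing the objective values.
    if (v_a <= eps and v_b <= eps) or v_a == v_b:
        return f_a < f_b
    # Otherwise the candidate with the smaller violation wins.
    return v_a < v_b


cheap_but_infeasible = (5.0, 3.0)   # low cost, violates constraints by 3
costly_but_feasible = (12.0, 0.0)   # higher cost, fully feasible

strict = epsilon_better(cheap_but_infeasible, costly_but_feasible, eps=0.0)
relaxed = epsilon_better(cheap_but_infeasible, costly_but_feasible, eps=5.0)
```

With `eps=0.0` the feasible candidate is preferred, while with `eps=5.0` the violation of 3 is tolerated and the cheaper candidate wins. In typical uses of this technique, ε is reduced toward zero over the generations so that the final population is feasible.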
To efficiently solve the materialized view selection problem, an optimal genetic algorithm for selecting a set of views to be materialized is proposed, so as to achieve both good query performance and low view maintenance cost under a storage space constraint. First, a pre-processing algorithm based on the maximum benefit per unit space is used to generate initial solutions. Then, the initial solutions are improved by a genetic algorithm that uses a mixture of optimal strategies. Furthermore, infeasible solutions generated during the evolution process are repaired by a loss function. The experimental results show that the proposed algorithm outperforms the heuristic algorithm and the canonical genetic algorithm in finding optimal solutions.
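The pre-processing step described above, which ranks views by benefit per unit space under a storage limit, can be sketched as a simple greedy pass. This is a minimal illustration and not the authors' algorithm: the `View` type, the view names and the numbers are hypothetical, and each view's benefit is treated as a fixed estimate, whereas in practice the benefit of a view usually depends on which other views are already materialized.

```python
from typing import NamedTuple


class View(NamedTuple):
    name: str
    size: float     # storage the view would occupy (assumed > 0)
    benefit: float  # estimated query-cost saving if materialized


def greedy_initial_selection(views, capacity):
    """Pick views in descending benefit-per-unit-space order while they fit."""
    chosen, used = [], 0.0
    ranked = sorted(views, key=lambda v: v.benefit / v.size, reverse=True)
    for v in ranked:
        if used + v.size <= capacity:  # skip views that would exceed storage
            chosen.append(v.name)
            used += v.size
    return chosen, used


# Hypothetical candidate views, for illustration only.
candidates = [
    View("sales_by_day", size=40, benefit=120),
    View("sales_by_region", size=10, benefit=50),
    View("sales_by_product", size=30, benefit=60),
    View("sales_by_customer", size=25, benefit=100),
]
selection, used_space = greedy_initial_selection(candidates, capacity=60)
```

A selection of this kind is always feasible with respect to storage, which makes it a reasonable seed for a population that the genetic algorithm then improves.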
The data warehouse is the most widely used database structure in many decision support systems around the world. For this reason, a great deal of research over the last two decades has addressed its design, refreshment and optimization. The manipulation of hypercubes (cubes) of data is a frequently used operation in the design of multidimensional data warehouses, because hypercubes are well suited to OLAP (On-Line Analytical Processing). However, updating these hypercubes is a very complicated process, mainly because of the volume and complexity of the data involved. This paper presents the state of the art of work based on multidimensional modeling that uses the hypercube as the unit of presentation of data stores. It starts with the basis of this process, namely the choice of the views (cubes) that form the data warehouse base. The objective of this work is to describe the state of the art of research dealing with the selection of materialized views in decision support systems.
Maintaining a semantic cache of materialized XPath views inside or outside the database is a novel, feasible and efficient approach to facilitating XML query processing. However, most of the existing approaches have the following disadvantages: 1) they cannot discover enough potential cached views to effectively answer subsequent queries; or 2) they are inefficient for view selection due to the complexity of XPath expressions. In this paper, we propose SCEND, an effective Semantic Cache based on dEcompositioN and Divisibility, to exploit the XPath query/view answerability. The contributions of this paper include: 1) a novel technique for decomposing complex XPath queries into much simpler ones, which facilitates discovering more potential views to answer a new query than the existing methods and thus adequately exploits the query/view answerability; 2) an efficient view-selection method that checks the divisibility between two positive numbers assigned to queries and views; 3) a cache-replacement approach that further enhances the query/view answerability; 4) an extensive experimental study which demonstrates that our approach achieves higher performance and significantly outperforms the existing state-of-the-art alternative methods.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 60873065, the National High Technology Research and Development 863 Program of China under Grant Nos. 2007AA01Z152 and 2009AA011906, and the National Basic Research 973 Program of China under Grant No. 2006CB303103.