Abstract: Dynamic programming (DP) is an effective query optimization approach for selecting an appropriate join order for multi-table joins in a relational database management system (RDBMS). This method was extended and made available in distributed DBMS (D-DBMS). The structure of the optimal solution was first characterized according to the distribution status of tables and data, and then the recurrence relations between a problem and its sub-problems were defined recursively. DP in D-DBMS has the same time complexity as in a centralized DBMS, while being able to solve the much more sophisticated optimization problem of multi-table joins in D-DBMS. The effectiveness of this optimization strategy has been demonstrated by experiments.
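To make the idea of a DP recurrence over join sub-problems concrete, the following is a minimal sketch and not the method of the paper summarized above. It assumes a hypothetical cost model: each join costs the size of its result, plus a shipping cost proportional to the smaller input when the two inputs reside on different sites. The names `best_join_order`, `card`, `site`, `sel`, and `ship_cost` are illustrative assumptions. For brevity the sketch keeps a single best plan per table subset regardless of where its result ends up, which is a simplification.

```python
from itertools import combinations


def best_join_order(card, site, sel, ship_cost=1.0):
    """Subset DP for join ordering under a toy distributed cost model.

    card: table name -> row count (assumed statistics)
    site: table name -> site where the table is stored
    sel:  frozenset({a, b}) -> join selectivity; missing pairs count as 1.0
    Returns (cost, rows, result_site, plan) for joining all tables.
    """
    tables = sorted(card)
    # Base case: a single table costs nothing and stays on its own site.
    best = {frozenset([t]): (0.0, float(card[t]), site[t], t) for t in tables}

    for size in range(2, len(tables) + 1):
        for combo in combinations(tables, size):
            subset = frozenset(combo)
            # Recurrence: try every split of the subset into two sub-problems.
            for k in range(1, size):
                for left_combo in combinations(combo, k):
                    left = frozenset(left_combo)
                    right = subset - left
                    l_cost, l_rows, l_site, l_plan = best[left]
                    r_cost, r_rows, r_site, r_plan = best[right]

                    # Combined selectivity of all predicates across the split.
                    factor = 1.0
                    for a in left:
                        for b in right:
                            factor *= sel.get(frozenset((a, b)), 1.0)
                    rows = l_rows * r_rows * factor

                    # Assumed policy: ship the smaller input to the other site.
                    if l_site == r_site:
                        transfer, out_site = 0.0, l_site
                    elif l_rows <= r_rows:
                        transfer, out_site = ship_cost * l_rows, r_site
                    else:
                        transfer, out_site = ship_cost * r_rows, l_site

                    cost = l_cost + r_cost + transfer + rows
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, rows, out_site, (l_plan, r_plan))

    return best[frozenset(tables)]


# Hypothetical three-table example: A and B share a site, C is remote.
card = {"A": 1000, "B": 10, "C": 100}
site = {"A": "s1", "B": "s1", "C": "s2"}
sel = {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.1}
cost, rows, out_site, plan = best_join_order(card, site, sel)
```

In this toy example the cheapest plan joins the co-located tables A and B first and ships only one intermediate input across sites. A fuller treatment would likely track the result site as part of the DP state, since the cheapest plan for a subset may differ depending on where the result is needed.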
Funding: This project was supported by the Key Research and Development Program (2018YFB1003403), the National Natural Science Foundation of China (Grant Nos. 61732014, 61672432, 61672434), and the Natural Science Basic Research Plan in Shaanxi Province of China (2017JM6104).
Abstract: Collecting statistics is a time- and resource-consuming operation in database systems. It is even more challenging to collect statistics efficiently in a distributed database without affecting system performance while keeping the statistics correct. Traditional strategies usually consider only one dimension when collecting statistics and therefore lack adaptiveness. In this paper, we propose an adaptive strategy for statistics collecting (ASC), which balances collection efficiency, correctness of statistics, and impact on system performance. We formally define the procedure of collecting statistics, abstract the relationships among collection efficiency, correctness of statistics, and impact on system performance, and introduce an elastic structure (ESI) that stores the necessary information generated while the strategy runs. Using the information stored in ESI, ASC can pick an appropriate time to trigger a collection action, filter out unnecessary tasks, and allocate collection tasks to appropriate execution locations with the right execution models. We implement and evaluate our strategy in a distributed database. Experiments show that, compared with other strategies, our solution generally improves the efficiency and correctness of statistics collection and reduces the negative impact on system performance.