Funding: Partially supported by NSFC under Grant Nos. 61832001 and 62272008, and by the ZTE Industry-University-Institute Fund Project.
Abstract: Query processing in distributed database management systems (DBMSs) faces more challenges than in a single-node DBMS, including more operators and more factors in cost models and metadata, even though query optimization in a single-node DBMS is already an NP-hard problem. Learned query optimizers (mainly for single-node DBMSs) have received attention because they can capture data distributions and offer flexible ways to avoid hand-crafted rules during refinement and adaptation to new hardware. In this paper, we focus on extending learned query optimizers to distributed DBMSs. Specifically, we propose one possible but general architecture for a learned query optimizer in the distributed context and highlight how it differs from learned optimizers for single-node systems. In addition, we discuss the challenges and possible solutions.
Abstract: Purpose: The resilient distributed processing technique (RDPT) simplifies the mapper and reducer with Spark contexts and supports distributed parallel query processing. Design/methodology/approach: The proposed work is implemented in Pig Latin with Spark contexts to develop query processing in a distributed environment. Findings: Query processing in Hadoop relies on the MapReduce model for distributed processing. MapReduce handles work on different nodes through the implementation of complex mappers and reducers, and its results hold only up to a certain data size. Originality/value: Pig supplies the required parallel processing framework through the following constructs during query processing: FOREACH, FLATTEN, and COGROUP.
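To make the three constructs named in this abstract concrete, here is a minimal pure-Python sketch of their semantics as documented for Pig Latin: COGROUP collects the rows of two relations into one pair of bags per key, and a FOREACH that applies FLATTEN to both bags emits one row per combination, dropping keys where either bag is empty. This is not Pig, Spark, or the RDPT implementation; the function names `cogroup` and `foreach_flatten` and the sample relations are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import product


def cogroup(left, right, key=lambda row: row[0]):
    """Mimic COGROUP: one (key, left_bag, right_bag) triple per distinct key."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[key(row)][0].append(row)
    for row in right:
        groups[key(row)][1].append(row)
    # Sort by key only so the output order is deterministic.
    return [(k, groups[k][0], groups[k][1]) for k in sorted(groups)]


def foreach_flatten(cogrouped):
    """Mimic FOREACH ... GENERATE FLATTEN(left_bag), FLATTEN(right_bag).

    Flattening two bags yields their cross product; an empty bag
    therefore produces no output rows for that key.
    """
    out = []
    for _key, left_bag, right_bag in cogrouped:
        for l_row, r_row in product(left_bag, right_bag):
            out.append(l_row + r_row)
    return out


# Hypothetical sample relations: (user_id, item) and (user_id, name).
orders = [(1, "book"), (2, "pen"), (1, "lamp")]
users = [(1, "ann"), (3, "bob")]

grouped = cogroup(orders, users)
joined = foreach_flatten(grouped)
```

In this sketch, keys 2 and 3 appear in `grouped` with one empty bag each and vanish after flattening, which is why COGROUP followed by FLATTEN on both bags behaves like an inner join.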
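Abstract: The k-nearest-neighbor (kNN) query over dynamic road networks is an important problem in many location-based services (LBS). To address it, DkNN (Distributed kNN), a distributed kNN query algorithm for moving objects on dynamic road networks, is proposed. First, the entire road network is partitioned into multiple subgraphs that are deployed on different nodes of a cluster. Second, exact kNN results are obtained by searching, in parallel, the subgraphs covered by the query range. Finally, the search process is optimized by introducing a query-range pruning strategy and a query termination strategy. The algorithm is compared and validated against three baseline algorithms on four road network datasets. Experimental results show that, compared with the TEN*-Index (Tree dEcomposition based kNN* Index) algorithm, DkNN reduces query time by 56.8% and road network update time by three orders of magnitude. DkNN can respond quickly to kNN queries on dynamic road networks and incurs a low update cost when handling road network updates.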
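The two ideas in this abstract that are easiest to show in code are the partitioned storage of the road network and early termination of the search. The following is a single-process sketch under stated assumptions, not the authors' DkNN: it runs an ordinary Dijkstra expansion rather than a parallel search, stores each node's adjacency list in the partition that owns the node, and stops as soon as k objects have been found. All names (`knn_on_partitioned_network`, `partitions`, `owner`, `objects`) and the toy network are hypothetical.

```python
import heapq


def knn_on_partitioned_network(partitions, owner, objects, source, k):
    """Return the k nearest objects to `source` and the partitions touched.

    partitions: partition_id -> {node: [(neighbor, edge_weight), ...]}
    owner:      node -> partition_id holding that node's adjacency list
    objects:    node -> list of object ids currently located at the node
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    settled = set()
    touched = set()
    results = []
    # Dijkstra settles nodes in nondecreasing distance, so once k objects
    # are collected no unsettled node can hold a closer one: terminate.
    while heap and len(results) < k:
        d, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        pid = owner[u]
        touched.add(pid)  # only partitions reached by the search are read
        for obj in objects.get(u, []):
            results.append((d, obj))
        for v, w in partitions[pid].get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return results[:k], touched


# Hypothetical toy network split into three partitions.
partitions = {
    0: {"a": [("b", 1.0), ("e", 10.0)], "b": [("a", 1.0), ("c", 2.0)]},
    1: {"c": [("b", 2.0), ("d", 1.0)], "d": [("c", 1.0)]},
    2: {"e": [("a", 10.0)]},
}
owner = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}
objects = {"b": ["taxi1"], "d": ["taxi2"], "e": ["taxi3"]}

nearest, touched = knn_on_partitioned_network(partitions, owner, objects, "a", 2)
```

In this toy run the search finishes after reading partitions 0 and 1 only; partition 2 is never consulted because the second object is found before the node at distance 10 is settled. A distributed implementation would additionally have to coordinate border nodes between workers, which this sketch leaves out.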
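Abstract: In both traditional network environments and P2P environments, clients submit OLAP queries to an OLAP server and obtain the query results from it, so the load on the OLAP server rises sharply as the number of clients grows. A DQDC (Distributed Query Data Cube) algorithm based on P2P (Peer-to-Peer) technology is designed to enable semantic-level sharing of Data Cube data across multiple nodes in a P2P network, thereby improving the overall decision-analysis performance of the system.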