Replication is an approach often used to speed up the execution of queries submitted to a large dataset.A compile-time/run-time approach is presented for minimizing the response time of 2-dimensional range when a dist...Replication is an approach often used to speed up the execution of queries submitted to a large dataset.A compile-time/run-time approach is presented for minimizing the response time of 2-dimensional range when a distributed replica of a dataset exists.The aim is to partition the query payload(and its range) into subsets and distribute those to the replica nodes in a way that minimizes a client's response time.However,since query size and distribution characteristics of data(data dense/sparse regions) in varying ranges are not known a priori,performing efficient load balancing and parallel processing over the unpredictable workload is difficult.A technique based on the creation and manipulation of dynamic spatial indexes for query payload estimation in distributed queries was proposed.The effectiveness of this technique was demonstrated on queries for analysis of archived earthquake-generated seismic data records.展开更多
The query optimizer uses cost-based optimization to create an execution plan with the least cost,which also consumes the least amount of resources.The challenge of query optimization for relational database systems is...The query optimizer uses cost-based optimization to create an execution plan with the least cost,which also consumes the least amount of resources.The challenge of query optimization for relational database systems is a combinatorial optimization problem,which renders exhaustive search impossible as query sizes rise.Increases in CPU performance have surpassed main memory,and disk access speeds in recent decades,allowing data compression to be used—strategies for improving database performance systems.For performance enhancement,compression and query optimization are the two most factors.Compression reduces the volume of data,whereas query optimization minimizes execution time.Compressing the database reduces memory requirement,data takes less time to load into memory,fewer buffer missing occur,and the size of intermediate results is more diminutive.This paper performed query optimization on the graph database in a cloud dew environment by considering,which requires less time to execute a query.The factors compression and query optimization improve the performance of the databases.This research compares the performance of MySQL and Neo4j databases in terms of memory usage and execution time running on cloud dew servers.展开更多
Semantic query optimization (SQO) is comparatively a recent approach for the transformation of given query into equivalent alternative query using matching rules in order to select an optimal query based on the costs ...Semantic query optimization (SQO) is comparatively a recent approach for the transformation of given query into equivalent alternative query using matching rules in order to select an optimal query based on the costs of executing alternative queries. The key aspect of the algorithm proposed here is that previous proposed SQO techniques can be considered equally in the uniform cost model, with which optimization opportunities will not be missed. At the same time, the authors used the implication closure to guarantee that any matched rule will not be lost. The authors implemented their algorithm for the optimization of decomposed sub-query in local database in Multi-Database Integrator (MDBI), which is a multidatabase project. The experimental results verify that this algorithm is effective in the process of SQO.展开更多
Aiming at the problem that only some types of SPARQL ( simple protocal and resource description framework query language) queries can be answered by using the current resource description framework link traversal ba...Aiming at the problem that only some types of SPARQL ( simple protocal and resource description framework query language) queries can be answered by using the current resource description framework link traversal based query execution (RDF-LTE) approach, this paper discusses how the execution order of the triple pattern affects the query results and cost based on concrete SPARQL queries, and analyzes two properties of the web of linked data, missing backward links and missing contingency solution. Then three heuristic principles for logic query plan optimization, namely, the filtered basic graph pattern (FBGP) principle, the triple pattern chain principle and the seed URIs principle, are proposed. The three principles contribute to decrease the intermediate solutions and increase the types of queries that can be answered. The effectiveness and feasibility of the proposed approach is evaluated. The experimental results show that more query results can be returned with less cost, thus enabling users to develop the full potential of the web of linked data.展开更多
As the popularity of XML (extensible Markup Language) keeps growing rapidly,the management of XML compliant structured-document databases has become a very interesting andcompelling research area. Query optimization f...As the popularity of XML (extensible Markup Language) keeps growing rapidly,the management of XML compliant structured-document databases has become a very interesting andcompelling research area. Query optimization for XML structured-documents stands out as one of themost challenging research issues in this area because of the much enlarged optimization (search)space, which is a consequence of the intrinsic complexity of the underlying data model of XML data.We therefore propose to apply deterministic transformations on query expressions to mostaggressively prune the search space and fast achieve a sufficiently improved alternative (if not theoptimal) for each incoming query expression. This idea is not just exciting but practicallyattainable. This paper first provides an overview of our optimization strategy, and then focuses onthe key implementation issues of our rule-based transformation system for XML query optimization ina database environment. The performance results we obtained from experimentation show that ourapproach is a valid and effective one.展开更多
Fundamentally, semantic grid database is about bringing globally distributed databases together in order to coordinate resource sharing and problem solving in which information is given well-defined meaning, and DartG...Fundamentally, semantic grid database is about bringing globally distributed databases together in order to coordinate resource sharing and problem solving in which information is given well-defined meaning, and DartGrid II is the implemented database gird system whose goal is to provide a semantic solution for integrating database resources on the Web. Although many algorithms have been proposed for optimizing query-processing in order to minimize costs and/or response time, associated with obtaining the answer to query in a distributed database system, database grid query optimization problem is fundamentally different from traditional distributed query optimization. These differences are shown to be the consequences of autonomy and heterogeneity of database nodes in database grid. Therefore, more challenges have arisen for query optimization in database grid than traditional distributed database. Following this observation, the design of a query optimizer in DartGrid II is presented, and a heuristic, dynamic and parallel query optimization approach to processing query in database grid is proposed. A set of semantic tools supporting relational database integration and semantic-based information browsing has also been implemented to realize the above vision.展开更多
A systematic, efficient compilation method for query evaluation of DeductiveDatabases (DeDB) is proposed in this paper. In order to eliminate redundancyand to minimize the potentially relevant facts, which are two key...A systematic, efficient compilation method for query evaluation of DeductiveDatabases (DeDB) is proposed in this paper. In order to eliminate redundancyand to minimize the potentially relevant facts, which are two key issues to theefficiency of a DeDB, the compilation process is decomposed into two phases.The first is the pre-compilation phase, which is responsible for the minimiza-tion of the potentially relevant facts. The second, which we refer to as thegeneral compilation phase, is responsible for the elimination of redundancy.The rule/goal graph devised by J. D. Ullman is appropriately extended andused as a uniform formalism. Two general algorithms corresponding to the twophases respectively are described intuitively and formally展开更多
The query processing in distributed database management systems(DBMS)faces more challenges,such as more operators,and more factors in cost models and meta-data,than that in a single-node DMBS,in which query optimizati...The query processing in distributed database management systems(DBMS)faces more challenges,such as more operators,and more factors in cost models and meta-data,than that in a single-node DMBS,in which query optimization is already an NP-hard problem.Learned query optimizers(mainly in the single-node DBMS)receive attention due to its capability to capture data distributions and flexible ways to avoid hard-craft rules in refinement and adaptation to new hardware.In this paper,we focus on extensions of learned query optimizers to distributed DBMSs.Specifically,we propose one possible but general architecture of the learned query optimizer in the distributed context and highlight differences from the learned optimizer in the single-node ones.In addition,we discuss the challenges and possible solutions.展开更多
Through the mapping from UMQL ( unified multimedia query language) conditional expressions to UMQA (unified multimedia query algebra) query operations, a translation algorithm from a UMQL query to a UMQA query pla...Through the mapping from UMQL ( unified multimedia query language) conditional expressions to UMQA (unified multimedia query algebra) query operations, a translation algorithm from a UMQL query to a UMQA query plan is put forward, which can generate an equivalent UMQA internal query plan for any UMQL query. Then, to improve the execution costs of UMQA query plans effectively, equivalent UMQA translation formulae and general optimization strategies are studied, and an optimization algorithm for UMQA internal query plans is presented. This algorithm uses equivalent UMQA translation formulae to optimize query plans, and makes the optimized query plans accord with the optimization strategies as much as possible. Finally, the logic implementation methods of UMQA plans, i.e., logic implementation methods of UMQA operators, are discussed to obtain useful target data from a muifirnedia database. All of these algorithms are implemented in a UMQL prototype system. Application results show that these query processing techniques are feasible and applicable.展开更多
The rigid structure of the traditional relational database leads to data redundancy,which seriously affects the efficiency of the data query and cannot effectively manage massive data.To solve this problem,we use dist...The rigid structure of the traditional relational database leads to data redundancy,which seriously affects the efficiency of the data query and cannot effectively manage massive data.To solve this problem,we use distributed storage and parallel computing technology to query RDF data.In order to achieve efficient storage and retrieval of large-scale RDF data,we combine the respective advantage of the storage model of the relational database and the distributed query.To overcome the disadvantages of storing and querying RDF data,we design and implement a breadth-first path search algorithm based on the keyword query on a distributed platform.We conduct the LUBM query statements respectively with the selected data sets.In experiments,we compare query response time in different conditions to evaluate the feasibility and correctness of our approaches.The results show that the proposed scheme can reduce the storage cost and improve query efficiency.展开更多
Join-aggregate is an important and widely used operation in database system. However, it is time-consuming to process join-aggregate query in big data environment, especially on MapReduce framework. The main bottlenec...Join-aggregate is an important and widely used operation in database system. However, it is time-consuming to process join-aggregate query in big data environment, especially on MapReduce framework. The main bottlenecks contain two aspects: lots of I/O caused by temporary data and heavy communication overhead between different data nodes during query processing. To overcome such disadvantages, we design a data structure called Reference Primary Key table (RPK-table) which stores the relationship of primary key and foreign key between tables. Based on this structure, we propose an improved algorithm on MapReduce framework for join-aggregate query. Experi-ments on TPC-H dataset demonstrate that our algorithm outperforms existing methods in terms of communication cost and query response time.展开更多
Semantic Web has emerged to make web content machine-readable,and with the rapid increase in the number of web pages,its importance has increased.Resource description framework(RDF)is a special data graph format where...Semantic Web has emerged to make web content machine-readable,and with the rapid increase in the number of web pages,its importance has increased.Resource description framework(RDF)is a special data graph format where Semantic Web data are stored and it can be queried by SPARQL query language.The challenge is to find the optimal query order that results in the shortest period of time.In this paper,the discrete Artificial Bee Colony(dABCSPARQL)algorithm is proposed,based on a novel heuristic approach,namely reordering SPARQL queries.The processing time of queries with different shapes and sizes is minimized using the dABCSPARQL algorithm.The performance of the proposed method is evaluated on chain,star,cyclic,and chain-star queries of different sizes from the Lehigh University Benchmark(LUBM)dataset.The results obtained by the proposed method are compared with those of ARQ(a SPARQL processor for Jena)query engine,the Ant System,the Elitist Ant System,and MAX-MIN Ant System algorithms.The experiments demonstrate that the proposed method significantly reduces the processing time,and in most queries,the reduction rate is higher compared with other optimization methods.展开更多
Kubernetes is an open-source container management tool which automates container deployment,container load balancing and container(de)scaling,including Horizontal Pod Autoscaler(HPA),Vertical Pod Autoscaler(VPA).HPA e...Kubernetes is an open-source container management tool which automates container deployment,container load balancing and container(de)scaling,including Horizontal Pod Autoscaler(HPA),Vertical Pod Autoscaler(VPA).HPA enables flawless operation,interactively scaling the number of resource units,or pods,without downtime.Default Resource Metrics,such as CPU and memory use of host machines and pods,are monitored by Kubernetes.Cloud Computing has emerged as a platform for individuals beside the corporate sector.It provides cost-effective infrastructure,platform and software services in a shared environment.On the other hand,the emergence of industry 4.0 brought new challenges for the adaptability and infusion of cloud computing.As the global work environment is adapting constituents of industry 4.0 in terms of robotics,artificial intelligence and IoT devices,it is becoming eminent that one emerging challenge is collaborative schematics.Provision of such autonomous mechanism that can develop,manage and operationalize digital resources like CoBots to perform tasks in a distributed and collaborative cloud environment for optimized utilization of resources,ensuring schedule completion.Collaborative schematics are also linked with Bigdata management produced by large scale industry 4.0 setups.Different use cases and simulation results showed a significant improvement in Pod CPU utilization,latency,and throughput over Kubernetes environment.展开更多
Cloud computingmakes dynamic resource provisioning more accessible.Monitoring a functioning service is crucial,and changes are made when particular criteria are surpassed.This research explores the decentralized multi...Cloud computingmakes dynamic resource provisioning more accessible.Monitoring a functioning service is crucial,and changes are made when particular criteria are surpassed.This research explores the decentralized multi-cloud environment for allocating resources and ensuring the Quality of Service(QoS),estimating the required resources,and modifying allotted resources depending on workload and parallelism due to resources.Resource allocation is a complex challenge due to the versatile service providers and resource providers.The engagement of different service and resource providers needs a cooperation strategy for a sustainable quality of service.The objective of a coherent and rational resource allocation is to attain the quality of service.It also includes identifying critical parameters to develop a resource allocation mechanism.A framework is proposed based on the specified parameters to formulate a resource allocation process in a decentralized multi-cloud environment.The three main parameters of the proposed framework are data accessibility,optimization,and collaboration.Using an optimization technique,these three segments are further divided into subsets for resource allocation and long-term service quality.The CloudSim simulator has been used to validate the suggested framework.Several experiments have been conducted to find the best configurations suited for enhancing collaboration and resource allocation to achieve sustained QoS.The results support the suggested structure for a decentralized multi-cloud environment and the parameters that have been determined.展开更多
Quality traceability plays an essential role in assembling and welding offshore platform blocks.The improvement of the welding quality traceability system is conducive to improving the durability of the offshore platf...Quality traceability plays an essential role in assembling and welding offshore platform blocks.The improvement of the welding quality traceability system is conducive to improving the durability of the offshore platform and the process level of the offshore industry.Currently,qualitymanagement remains in the era of primary information,and there is a lack of effective tracking and recording of welding quality data.When welding defects are encountered,it is difficult to rapidly and accurately determine the root cause of the problem from various complexities and scattered quality data.In this paper,a composite welding quality traceability model for offshore platform block construction process is proposed,it contains the quality early-warning method based on long short-term memory and quality data backtracking query optimization algorithm.By fulfilling the training of the early-warning model and the implementation of the query optimization algorithm,the quality traceability model has the ability to assist enterprises in realizing the rapid identification and positioning of quality problems.Furthermore,the model and the quality traceability algorithm are checked by cases in actual working conditions.Verification analyses suggest that the proposed early-warningmodel for welding quality and the algorithmfor optimizing backtracking requests are effective and can be applied to the actual construction process.展开更多
We developed a parallel object relational DBMS named PORLES. It uses BSP model as its parallel computing model, and monoid calculus as its basis of data model. In this paper, we introduce its data model, parallel que...We developed a parallel object relational DBMS named PORLES. It uses BSP model as its parallel computing model, and monoid calculus as its basis of data model. In this paper, we introduce its data model, parallel query optimization, transaction processing system and parallel access method in detail.展开更多
An extent join to compute path expressions containing parent-children andancestor-descendent operations and two path expression optimization rules, path-shortening andpath-complementing, are presented in this paper. P...An extent join to compute path expressions containing parent-children andancestor-descendent operations and two path expression optimization rules, path-shortening andpath-complementing, are presented in this paper. Path-shortening reduces the number of joins byshortening the path while path-complementing optimizes the path execution by using an equivalentcomplementary path expression to compute the original one. Experimental results show that thealgorithms proposed are more efficient than traditional algorithms.展开更多
Static optimization of logical queries is, in substance, to move selections down as far as possible in evaluating logical queries. This paper extends Ullman's RGG (Rule/Goal Graph) and introduces P- graph, with wh...Static optimization of logical queries is, in substance, to move selections down as far as possible in evaluating logical queries. This paper extends Ullman's RGG (Rule/Goal Graph) and introduces P- graph, with which a wide range of recursive logical queries can be statically optimized top-down and evaluated bottom-up, some of which are usually optimized by dynamic approaches. The paper also shows that for some logical queries the complexity of pushing selections down and computing bottom-up is related to the complexity of base relation in the queries.展开更多
We present the first efficient sound and complete algorithm (i.e., AOMSSQ) for optimizing multiple subspace skyline queries simultaneously in this paper. We first identify three performance problems of the na/ve app...We present the first efficient sound and complete algorithm (i.e., AOMSSQ) for optimizing multiple subspace skyline queries simultaneously in this paper. We first identify three performance problems of the na/ve approach (i.e., SUBSKY) which can be used in processing arbitrary single-subspace skyline query. Then we propose a cell-dominance computation algorithm (i.e., CDCA) to efficiently overcome the drawbacks of SUBSKY. Specially, a novel pruning technique is used in CDCA to dramatically decrease the query time. Finally, based on the CDCA algorithm and the share mechanism between subspaces, we present and discuss the AOMSSQ algorithm and prove it sound and complete. We also present detailed theoretical analyses and extensive experiments that demonstrate our algorithms are both efficient and effective.展开更多
Our study introduces a novel distributed query plan refinement phase in an enhanced architecture of distributed query processing engine (DQPE). Query plan refinement generates potentially efficient distributed query...Our study introduces a novel distributed query plan refinement phase in an enhanced architecture of distributed query processing engine (DQPE). Query plan refinement generates potentially efficient distributed query plan by reusable aggregate query shipping (RAQS) approach. The approach improves response time at the cost of pre-processing time. If the overheads could not be compensated by query results reusage, RAQS is no more favorable. Therefore a globM cost estimation model is employed to get proper operators: RR_Agg, R_Agg, or R_Scan. For the purpose of reusing results of queries with aggregate function in distributed query processing, a multi-level hybrid view caching (HVC) scheme is introduced. The scheme retains the advantages of partial match and aggregate query results caching. By our solution, evaluations with distributed TPC-H queries show significant improvement on average response time.展开更多
文摘Replication is an approach often used to speed up the execution of queries submitted to a large dataset.A compile-time/run-time approach is presented for minimizing the response time of 2-dimensional range when a distributed replica of a dataset exists.The aim is to partition the query payload(and its range) into subsets and distribute those to the replica nodes in a way that minimizes a client's response time.However,since query size and distribution characteristics of data(data dense/sparse regions) in varying ranges are not known a priori,performing efficient load balancing and parallel processing over the unpredictable workload is difficult.A technique based on the creation and manipulation of dynamic spatial indexes for query payload estimation in distributed queries was proposed.The effectiveness of this technique was demonstrated on queries for analysis of archived earthquake-generated seismic data records.
文摘The query optimizer uses cost-based optimization to create an execution plan with the least cost,which also consumes the least amount of resources.The challenge of query optimization for relational database systems is a combinatorial optimization problem,which renders exhaustive search impossible as query sizes rise.Increases in CPU performance have surpassed main memory,and disk access speeds in recent decades,allowing data compression to be used—strategies for improving database performance systems.For performance enhancement,compression and query optimization are the two most factors.Compression reduces the volume of data,whereas query optimization minimizes execution time.Compressing the database reduces memory requirement,data takes less time to load into memory,fewer buffer missing occur,and the size of intermediate results is more diminutive.This paper performed query optimization on the graph database in a cloud dew environment by considering,which requires less time to execute a query.The factors compression and query optimization improve the performance of the databases.This research compares the performance of MySQL and Neo4j databases in terms of memory usage and execution time running on cloud dew servers.
文摘Semantic query optimization (SQO) is comparatively a recent approach for the transformation of given query into equivalent alternative query using matching rules in order to select an optimal query based on the costs of executing alternative queries. The key aspect of the algorithm proposed here is that previous proposed SQO techniques can be considered equally in the uniform cost model, with which optimization opportunities will not be missed. At the same time, the authors used the implication closure to guarantee that any matched rule will not be lost. The authors implemented their algorithm for the optimization of decomposed sub-query in local database in Multi-Database Integrator (MDBI), which is a multidatabase project. The experimental results verify that this algorithm is effective in the process of SQO.
基金The National Natural Science Foundation of China(No.61070170)the Natural Science Foundation of Higher Education Institutions of Jiangsu Province(No.11KJB520017)Suzhou Application Foundation Research Project(No.SYG201238)
文摘Aiming at the problem that only some types of SPARQL ( simple protocal and resource description framework query language) queries can be answered by using the current resource description framework link traversal based query execution (RDF-LTE) approach, this paper discusses how the execution order of the triple pattern affects the query results and cost based on concrete SPARQL queries, and analyzes two properties of the web of linked data, missing backward links and missing contingency solution. Then three heuristic principles for logic query plan optimization, namely, the filtered basic graph pattern (FBGP) principle, the triple pattern chain principle and the seed URIs principle, are proposed. The three principles contribute to decrease the intermediate solutions and increase the types of queries that can be answered. The effectiveness and feasibility of the proposed approach is evaluated. The experimental results show that more query results can be returned with less cost, thus enabling users to develop the full potential of the web of linked data.
文摘As the popularity of XML (extensible Markup Language) keeps growing rapidly,the management of XML compliant structured-document databases has become a very interesting andcompelling research area. Query optimization for XML structured-documents stands out as one of themost challenging research issues in this area because of the much enlarged optimization (search)space, which is a consequence of the intrinsic complexity of the underlying data model of XML data.We therefore propose to apply deterministic transformations on query expressions to mostaggressively prune the search space and fast achieve a sufficiently improved alternative (if not theoptimal) for each incoming query expression. This idea is not just exciting but practicallyattainable. This paper first provides an overview of our optimization strategy, and then focuses onthe key implementation issues of our rule-based transformation system for XML query optimization ina database environment. The performance results we obtained from experimentation show that ourapproach is a valid and effective one.
文摘Fundamentally, semantic grid database is about bringing globally distributed databases together in order to coordinate resource sharing and problem solving in which information is given well-defined meaning, and DartGrid II is the implemented database gird system whose goal is to provide a semantic solution for integrating database resources on the Web. Although many algorithms have been proposed for optimizing query-processing in order to minimize costs and/or response time, associated with obtaining the answer to query in a distributed database system, database grid query optimization problem is fundamentally different from traditional distributed query optimization. These differences are shown to be the consequences of autonomy and heterogeneity of database nodes in database grid. Therefore, more challenges have arisen for query optimization in database grid than traditional distributed database. Following this observation, the design of a query optimizer in DartGrid II is presented, and a heuristic, dynamic and parallel query optimization approach to processing query in database grid is proposed. A set of semantic tools supporting relational database integration and semantic-based information browsing has also been implemented to realize the above vision.
文摘A systematic, efficient compilation method for query evaluation of DeductiveDatabases (DeDB) is proposed in this paper. In order to eliminate redundancyand to minimize the potentially relevant facts, which are two key issues to theefficiency of a DeDB, the compilation process is decomposed into two phases.The first is the pre-compilation phase, which is responsible for the minimiza-tion of the potentially relevant facts. The second, which we refer to as thegeneral compilation phase, is responsible for the elimination of redundancy.The rule/goal graph devised by J. D. Ullman is appropriately extended andused as a uniform formalism. Two general algorithms corresponding to the twophases respectively are described intuitively and formally
基金partially supported by NSFC under Grant Nos.61832001 and 62272008ZTE Industry-University-Institute Fund Project。
文摘The query processing in distributed database management systems(DBMS)faces more challenges,such as more operators,and more factors in cost models and meta-data,than that in a single-node DMBS,in which query optimization is already an NP-hard problem.Learned query optimizers(mainly in the single-node DBMS)receive attention due to its capability to capture data distributions and flexible ways to avoid hard-craft rules in refinement and adaptation to new hardware.In this paper,we focus on extensions of learned query optimizers to distributed DBMSs.Specifically,we propose one possible but general architecture of the learned query optimizer in the distributed context and highlight differences from the learned optimizer in the single-node ones.In addition,we discuss the challenges and possible solutions.
基金The National High Technology Research and Development Program of China(863 Program) (No.2006AA01Z430)
文摘Through the mapping from UMQL ( unified multimedia query language) conditional expressions to UMQA (unified multimedia query algebra) query operations, a translation algorithm from a UMQL query to a UMQA query plan is put forward, which can generate an equivalent UMQA internal query plan for any UMQL query. Then, to improve the execution costs of UMQA query plans effectively, equivalent UMQA translation formulae and general optimization strategies are studied, and an optimization algorithm for UMQA internal query plans is presented. This algorithm uses equivalent UMQA translation formulae to optimize query plans, and makes the optimized query plans accord with the optimization strategies as much as possible. Finally, the logic implementation methods of UMQA plans, i.e., logic implementation methods of UMQA operators, are discussed to obtain useful target data from a muifirnedia database. All of these algorithms are implemented in a UMQL prototype system. Application results show that these query processing techniques are feasible and applicable.
基金This work is supported in part by National Natural Science Foundation of China(61728204)Innovation Funding(NJ20160028,NT2018027,NT2018028,NS2018057)+1 种基金Aeronautical Science Foundation of China(2016551500)State Key Laboratory for smart grid protection and operation control Foundation,Association of Chinese Graduate Education(ACGE).
文摘The rigid structure of the traditional relational database leads to data redundancy,which seriously affects the efficiency of the data query and cannot effectively manage massive data.To solve this problem,we use distributed storage and parallel computing technology to query RDF data.In order to achieve efficient storage and retrieval of large-scale RDF data,we combine the respective advantage of the storage model of the relational database and the distributed query.To overcome the disadvantages of storing and querying RDF data,we design and implement a breadth-first path search algorithm based on the keyword query on a distributed platform.We conduct the LUBM query statements respectively with the selected data sets.In experiments,we compare query response time in different conditions to evaluate the feasibility and correctness of our approaches.The results show that the proposed scheme can reduce the storage cost and improve query efficiency.
文摘Join-aggregate is an important and widely used operation in database system. However, it is time-consuming to process join-aggregate query in big data environment, especially on MapReduce framework. The main bottlenecks contain two aspects: lots of I/O caused by temporary data and heavy communication overhead between different data nodes during query processing. To overcome such disadvantages, we design a data structure called Reference Primary Key table (RPK-table) which stores the relationship of primary key and foreign key between tables. Based on this structure, we propose an improved algorithm on MapReduce framework for join-aggregate query. Experi-ments on TPC-H dataset demonstrate that our algorithm outperforms existing methods in terms of communication cost and query response time.
文摘Semantic Web has emerged to make web content machine-readable,and with the rapid increase in the number of web pages,its importance has increased.Resource description framework(RDF)is a special data graph format where Semantic Web data are stored and it can be queried by SPARQL query language.The challenge is to find the optimal query order that results in the shortest period of time.In this paper,the discrete Artificial Bee Colony(dABCSPARQL)algorithm is proposed,based on a novel heuristic approach,namely reordering SPARQL queries.The processing time of queries with different shapes and sizes is minimized using the dABCSPARQL algorithm.The performance of the proposed method is evaluated on chain,star,cyclic,and chain-star queries of different sizes from the Lehigh University Benchmark(LUBM)dataset.The results obtained by the proposed method are compared with those of ARQ(a SPARQL processor for Jena)query engine,the Ant System,the Elitist Ant System,and MAX-MIN Ant System algorithms.The experiments demonstrate that the proposed method significantly reduces the processing time,and in most queries,the reduction rate is higher compared with other optimization methods.
文摘Kubernetes is an open-source container management tool which automates container deployment,container load balancing and container(de)scaling,including Horizontal Pod Autoscaler(HPA),Vertical Pod Autoscaler(VPA).HPA enables flawless operation,interactively scaling the number of resource units,or pods,without downtime.Default Resource Metrics,such as CPU and memory use of host machines and pods,are monitored by Kubernetes.Cloud Computing has emerged as a platform for individuals beside the corporate sector.It provides cost-effective infrastructure,platform and software services in a shared environment.On the other hand,the emergence of industry 4.0 brought new challenges for the adaptability and infusion of cloud computing.As the global work environment is adapting constituents of industry 4.0 in terms of robotics,artificial intelligence and IoT devices,it is becoming eminent that one emerging challenge is collaborative schematics.Provision of such autonomous mechanism that can develop,manage and operationalize digital resources like CoBots to perform tasks in a distributed and collaborative cloud environment for optimized utilization of resources,ensuring schedule completion.Collaborative schematics are also linked with Bigdata management produced by large scale industry 4.0 setups.Different use cases and simulation results showed a significant improvement in Pod CPU utilization,latency,and throughput over Kubernetes environment.
文摘Cloud computingmakes dynamic resource provisioning more accessible.Monitoring a functioning service is crucial,and changes are made when particular criteria are surpassed.This research explores the decentralized multi-cloud environment for allocating resources and ensuring the Quality of Service(QoS),estimating the required resources,and modifying allotted resources depending on workload and parallelism due to resources.Resource allocation is a complex challenge due to the versatile service providers and resource providers.The engagement of different service and resource providers needs a cooperation strategy for a sustainable quality of service.The objective of a coherent and rational resource allocation is to attain the quality of service.It also includes identifying critical parameters to develop a resource allocation mechanism.A framework is proposed based on the specified parameters to formulate a resource allocation process in a decentralized multi-cloud environment.The three main parameters of the proposed framework are data accessibility,optimization,and collaboration.Using an optimization technique,these three segments are further divided into subsets for resource allocation and long-term service quality.The CloudSim simulator has been used to validate the suggested framework.Several experiments have been conducted to find the best configurations suited for enhancing collaboration and resource allocation to achieve sustained QoS.The results support the suggested structure for a decentralized multi-cloud environment and the parameters that have been determined.
基金funded by Ministry of Industry and Information Technology of the People’s Republic of China[Grant No.2018473].
文摘Quality traceability plays an essential role in assembling and welding offshore platform blocks.The improvement of the welding quality traceability system is conducive to improving the durability of the offshore platform and the process level of the offshore industry.Currently,qualitymanagement remains in the era of primary information,and there is a lack of effective tracking and recording of welding quality data.When welding defects are encountered,it is difficult to rapidly and accurately determine the root cause of the problem from various complexities and scattered quality data.In this paper,a composite welding quality traceability model for offshore platform block construction process is proposed,it contains the quality early-warning method based on long short-term memory and quality data backtracking query optimization algorithm.By fulfilling the training of the early-warning model and the implementation of the query optimization algorithm,the quality traceability model has the ability to assist enterprises in realizing the rapid identification and positioning of quality problems.Furthermore,the model and the quality traceability algorithm are checked by cases in actual working conditions.Verification analyses suggest that the proposed early-warningmodel for welding quality and the algorithmfor optimizing backtracking requests are effective and can be applied to the actual construction process.
文摘We developed a parallel object relational DBMS named PORLES. It uses BSP model as its parallel computing model, and monoid calculus as its basis of data model. In this paper, we introduce its data model, parallel query optimization, transaction processing system and parallel access method in detail.
文摘An extent join to compute path expressions containing parent-children andancestor-descendent operations and two path expression optimization rules, path-shortening andpath-complementing, are presented in this paper. Path-shortening reduces the number of joins byshortening the path while path-complementing optimizes the path execution by using an equivalentcomplementary path expression to compute the original one. Experimental results show that thealgorithms proposed are more efficient than traditional algorithms.
文摘Static optimization of logical queries is, in substance, to move selections down as far as possible in evaluating logical queries. This paper extends Ullman's RGG (Rule/Goal Graph) and introduces P- graph, with which a wide range of recursive logical queries can be statically optimized top-down and evaluated bottom-up, some of which are usually optimized by dynamic approaches. The paper also shows that for some logical queries the complexity of pushing selections down and computing bottom-up is related to the complexity of base relation in the queries.
基金This work is supported by the NSF of USA under Grant No.IIS-0308001the National Natural Science Foundation of China under Grant No.60303008the National Grand Fundamental Research 973 Program of China under Grant No.2005CB321905.
文摘We present the first efficient sound and complete algorithm (i.e., AOMSSQ) for optimizing multiple subspace skyline queries simultaneously in this paper. We first identify three performance problems of the na/ve approach (i.e., SUBSKY) which can be used in processing arbitrary single-subspace skyline query. Then we propose a cell-dominance computation algorithm (i.e., CDCA) to efficiently overcome the drawbacks of SUBSKY. Specially, a novel pruning technique is used in CDCA to dramatically decrease the query time. Finally, based on the CDCA algorithm and the share mechanism between subspaces, we present and discuss the AOMSSQ algorithm and prove it sound and complete. We also present detailed theoretical analyses and extensive experiments that demonstrate our algorithms are both efficient and effective.
基金partially supported by the National Basic Research 973 Program of China under Grant No. 2005CB321807the National High Technology Rresearch and Development 863 Program of China under Grant Nos. 2006AA01A106 and 2006AA04Z158.
文摘Our study introduces a novel distributed query plan refinement phase in an enhanced architecture of distributed query processing engine (DQPE). Query plan refinement generates potentially efficient distributed query plan by reusable aggregate query shipping (RAQS) approach. The approach improves response time at the cost of pre-processing time. If the overheads could not be compensated by query results reusage, RAQS is no more favorable. Therefore a globM cost estimation model is employed to get proper operators: RR_Agg, R_Agg, or R_Scan. For the purpose of reusing results of queries with aggregate function in distributed query processing, a multi-level hybrid view caching (HVC) scheme is introduced. The scheme retains the advantages of partial match and aggregate query results caching. By our solution, evaluations with distributed TPC-H queries show significant improvement on average response time.