Abstract: Data-intensive computing is expected to be the next-generation IT computing paradigm, and data-intensive workflows in clouds are becoming increasingly popular. Scheduling data-intensive workflows efficiently has become a key issue. In this paper, first, we build a directed hypergraph model for data-intensive workflows, since hypergraphs can model communication volume more accurately and better represent asymmetric problems, and the cut metric of hypergraphs is well suited to minimizing the total volume of communication. Second, we propose the concept of data supportive ability to help represent data-intensive workflow applications, and we provide the details of the merge operation that takes data supportive ability into account. Third, we present an optimized multi-level hypergraph partitioning algorithm. Finally, we introduce HEFT-P, a data-reduced scheduling policy for data-intensive workflows. Through simulation, we compare HEFT-P with three typical workflow scheduling policies. The results indicate that HEFT-P can achieve reduced-data scheduling and reduce the makespan of executing data-intensive
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60970038 and 61272148, the Science and Technology Plan Project of Hunan Province of China under Grant No. 2012GK3075, and the Scientific Research Fund of Hunan Provincial Education Department of China under Grant No. 13B015.
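Illustrative note on the hypergraph cut metric mentioned in the abstract above: the sketch below computes the standard "connectivity minus one" cut of a partitioned hypergraph, which is one common way to measure communication volume. It is only an assumption that this matches the metric used in the paper; the function name `connectivity_cut`, the task names, and the weights are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence


def connectivity_cut(
    hyperedges: Sequence[Sequence[str]],
    weights: Sequence[float],
    part_of: Dict[str, int],
) -> float:
    """Return the sum over hyperedges of weight * (parts spanned - 1).

    A hyperedge that lies entirely inside one part contributes nothing;
    a hyperedge spread over k parts contributes (k - 1) times its weight.
    """
    total = 0.0
    for edge, weight in zip(hyperedges, weights):
        parts_spanned = {part_of[vertex] for vertex in edge}
        total += weight * (len(parts_spanned) - 1)
    return total


# Hypothetical workflow: each hyperedge groups the tasks that touch one data item,
# and its weight is the size of that data item.
hyperedges: List[List[str]] = [["t1", "t2", "t3"], ["t2", "t4"], ["t3", "t4"]]
weights = [10.0, 4.0, 6.0]
part_of = {"t1": 0, "t2": 0, "t3": 1, "t4": 1}

cut = connectivity_cut(hyperedges, weights, part_of)
print(cut)
```

Unlike a plain graph edge cut, a data item shared by several tasks is counted once per extra part it must reach, rather than once per consumer pair, which is why hypergraph models are often preferred for estimating data movement.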
Abstract: With the development of cloud computing, more and more data-intensive workflows have been deployed on virtualized datacenters. As a result, the energy spent on massive data access is growing rapidly. In this paper, an energy-aware scheduling algorithm is proposed. It introduces a novel heuristic, called Minimal Data-Accessing Energy Path, for scheduling data-intensive workflows with the aim of reducing the energy consumed by intensive data access. Extensive experiments based on both synthetic and real workloads are conducted to investigate the effectiveness and performance of the proposed scheduling approach. The experimental results show that the proposed heuristic can significantly reduce the energy consumed in storing and retrieving the intermediate data generated during the execution of data-intensive workflows. In addition, it is more robust than existing algorithms when cloud systems are subject to I/O-intensive workloads.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61502043 and 61132001), the Beijing Natural Science Foundation (No. 4162042), and the Beijing Talents Fund (No. 2015000020124G082).
Abstract: With the growing popularity of data-intensive services on the Internet, the traditional process-centric model for business processes faces challenges because it cannot describe data semantics and dependencies, which makes process design and implementation inflexible. This paper proposes a novel data-aware business process model that can describe both explicit control flow and implicit data flow. A data model whose dependencies are formulated in Linear-time Temporal Logic (LTL) is presented, and the satisfiability of these dependencies is validated by an automaton-based model checking algorithm. Data dependencies are fully considered in the modeling phase, which helps to improve the efficiency and reliability of programming during the development phase. Finally, a prototype system for data-aware workflows, based on jBPM, is designed using this model and has been deployed in the Beijing Kingfore heating management system to validate the flexibility, efficacy, and convenience of our approach for large-scale coding and system management in practice.
Funding: Supported by the National Natural Science Foundation of China (61873030, 62002019).
Abstract: In the cloud-native era, Kubernetes-based workflow engines enable containerized workflow execution through the inherent abilities of Kubernetes. However, when facing continuous workflow requests and unexpected spikes in resource requests, the engine relies only on current workflow load information for resource allocation. The allocation therefore lacks agility and predictability, which results in over-provisioning and under-provisioning of resources. This mechanism seriously hinders workflow execution efficiency and leads to high resource waste. To overcome these drawbacks, we propose an adaptive resource allocation scheme (ARAS) for Kubernetes-based workflow engines. Considering potential future workflow task requests within the current task pod’s lifecycle, the ARAS uses a resource scaling strategy to allocate resources in response to high-concurrency workflow scenarios. The ARAS offers resource discovery, resource evaluation, and allocation functionalities and serves as a key component of our tailored workflow engine (KubeAdaptor). By integrating the ARAS into KubeAdaptor for containerized workflow execution, we demonstrate the practical abilities of KubeAdaptor and the advantages of the ARAS. Compared with the baseline algorithm, experimental evaluation under three distinct workflow arrival patterns shows that the ARAS saves 9.8% to 40.92% of the average total duration of all workflows, saves 26.4% to 79.86% of the average duration of an individual workflow, and increases the central processing unit (CPU) and memory resource usage rate by 1% to 16%.
Funding: Supported by the Financial and Science Technology Plan Project of Xinjiang Production and Construction Corps under Grant Nos. 2020DB005 and 2017DB005.
Abstract: To meet the growing demand for observation and analysis in power systems based on the Internet of Things (IoT), machine learning technology has been adopted to handle data-intensive power electronics applications in IoT. By feeding previous power electronic data into the learning model, accurate information is extracted and the quality of IoT-based power services is improved. Generally, data-intensive electronic applications with machine learning are split into numerous data- and control-constrained tasks by workflow technology. The efficient execution of such a data-intensive Power Workflow (PW) needs massive computing resources, which are available in the cloud infrastructure. Nevertheless, the execution efficiency of a PW decreases when sub-tasks and data are placed inappropriately. In addition, power consumption rises sharply because of massive data acquisition. To address these challenges, a PW placement method named PWP is devised. Specifically, Non-dominated Sorting Differential Evolution (NSDE) is used to generate placement strategies. The simulation experiments show that PWP achieves the best trade-off among data acquisition time, power consumption, load distribution, and privacy preservation, confirming that PWP is effective for the placement problem.
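Illustrative note on the non-dominated sorting named in the abstract above: the sketch below shows only the Pareto-dominance test and the extraction of the first non-dominated front for minimization objectives. It is not NSDE itself and not the authors' implementation; the function names and the candidate objective values (for example, acquisition time and power consumption) are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical placement strategies scored as (acquisition time, power consumption).
candidates = [(3.0, 9.0), (4.0, 7.0), (5.0, 8.0), (6.0, 5.0), (4.0, 9.5)]
front = pareto_front(candidates)
print(front)
```

A full non-dominated sorting step would repeat this extraction on the remaining points to build successive fronts, which an evolutionary algorithm can then use to rank candidate solutions.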
Abstract: In this paper, we propose a stochastic Petri net model, P-timed Workflow (WPTSPN), to specify, verify, and analyze a business process (BP) of a Flexible Manufacturing System (FMS). After formalizing the semantics of our model, we illustrate how to verify some of its properties (reachability, safety, boundedness, liveness, correctness, alive tokens, and security) in the P-timed context. Next, we validate the relevance of the proposed model with a MATLAB simulation of a specific FMS case study. Finally, we use a generalized truncated density function to predict the duration of a token’s sojourn (residence) in a timed place with respect to the sequence of states of the global FMS workflow.
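Illustrative note on the truncated density mentioned in the abstract above: the abstract does not give the generalized form used by the authors, so the block below states only the textbook definition of a density truncated to an interval. The symbols $f$, $F$, $a$, $b$, and $T$ are assumptions introduced here: $f$ and $F$ are the density and distribution function of the untruncated sojourn time $T$, and $[a, b]$ is the admissible residence interval of a timed place, with $F(b) > F(a)$.

```latex
f_{[a,b]}(t) =
\begin{cases}
  \dfrac{f(t)}{F(b) - F(a)}, & a \le t \le b, \\[1ex]
  0, & \text{otherwise},
\end{cases}
\qquad
\mathbb{E}\left[T \mid a \le T \le b\right]
  = \frac{1}{F(b) - F(a)} \int_{a}^{b} t \, f(t) \, \mathrm{d}t .
```

Dividing by $F(b) - F(a)$ renormalizes the density so that it integrates to one over $[a, b]$, and the conditional expectation gives a point prediction of the sojourn duration under that constraint.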