Abstract
Cloud computing, as an emerging computing model, provides a good operating environment for data-intensive workflows. Reasonable scheduling is the key to running such workflows efficiently in cloud environments, and completion time is commonly used as an important indicator of a scheduling method's quality. Most existing scheduling research calculates completion time from the execution time of each task and the sequential logic among tasks. For data-intensive workflows in cloud environments, however, the data transmission time between tasks also matters greatly to the completion time. To address this, a stage-division-based scheduling method for data-intensive workflows in cloud environments is proposed. First, the workflow is divided into several stages according to the data dependencies among tasks. Then, based on the estimated task execution times and the data transmission times between tasks, an allocation algorithm schedules the tasks stage by stage. Finally, experiments demonstrate the feasibility and effectiveness of the method.
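The two steps named in the abstract (stage division by data dependency, then stage-by-stage allocation using estimated execution and transmission times) can be illustrated with a minimal sketch. The abstract does not specify the paper's allocation algorithm, so the greedy earliest-finish rule below is an assumption standing in for it. The function names, the two virtual machines, and all numbers in the four-task example are likewise hypothetical.

```python
from collections import defaultdict


def split_into_stages(tasks, deps):
    """Group tasks into stages; a task sits one stage after its deepest predecessor."""
    preds = defaultdict(list)
    for src, dst in deps:
        preds[dst].append(src)
    level = {}

    def depth(t):
        if t not in level:
            level[t] = 1 + max((depth(p) for p in preds[t]), default=-1)
        return level[t]

    stages = defaultdict(list)
    for t in tasks:
        stages[depth(t)].append(t)
    return [stages[i] for i in sorted(stages)], preds


def schedule(tasks, deps, exec_time, transfer_time, vms):
    """Assumed greedy rule: place each task on the VM where it finishes earliest."""
    stages, preds = split_into_stages(tasks, deps)
    vm_free = {vm: 0.0 for vm in vms}  # time at which each VM becomes idle
    placed, finish = {}, {}
    for stage in stages:
        for t in stage:
            best = None
            for vm in vms:
                # Input data is ready once every predecessor has finished and,
                # if it ran on a different VM, its output has been transferred.
                ready = max(
                    (finish[p] + (0.0 if placed[p] == vm else transfer_time[(p, t)])
                     for p in preds[t]),
                    default=0.0,
                )
                end = max(ready, vm_free[vm]) + exec_time[t][vm]
                if best is None or end < best[0]:
                    best = (end, vm)
            finish[t], placed[t] = best
            vm_free[placed[t]] = finish[t]
    return placed, finish, max(finish.values())


# Hypothetical diamond-shaped workflow: A feeds B and C, which both feed D.
tasks = ["A", "B", "C", "D"]
deps = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
exec_time = {
    "A": {"v1": 2, "v2": 3},
    "B": {"v1": 4, "v2": 4},
    "C": {"v1": 3, "v2": 3},
    "D": {"v1": 2, "v2": 2},
}
transfer_time = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 5, ("C", "D"): 1}

placed, finish, makespan = schedule(tasks, deps, exec_time, transfer_time, ["v1", "v2"])
print(placed, makespan)
```

In this example the large B-to-D transfer keeps D on the same machine as B, which shows why a completion-time estimate that ignores transmission time can favor a worse placement.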
Authors
陈俊宇
刘茜萍
CHEN Junyu; LIU Xiping (School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Source
《南京邮电大学学报(自然科学版)》
Peking University Core Journal (北大核心)
2020, No. 4, pp. 103-110 (8 pages)
Journal of Nanjing University of Posts and Telecommunications: Natural Science Edition
Funding
Supported by the National Natural Science Foundation of China (61602260).
Keywords
cloud environment
data-intensive
workflow scheduling
completion time