Research on fault-tolerant grid task scheduling algorithms based on node similarity

Cited by: 2
Abstract: To raise the success rate of grid jobs, this paper studies ways to make job scheduling more reliable. Most existing fault-tolerant grid scheduling algorithms rely on job replication to lower the probability that a hardware or software fault on a node causes a job to fail. They consider neither the simultaneous failure of several replicas caused by a fault in the network environment the replicas share, nor the simultaneous failure that occurs when the nodes hosting the replicas all lack the same required resource. To address this problem, the paper proposes the concept of node similarity, together with a method for computing it, and applies it to a fault-tolerant grid task scheduling algorithm. The proposed algorithm tries to assign the replicas of a job to grid nodes that are less similar to one another, making full use of the distributed and heterogeneous nature of grids to further reduce the probability of job failure.
Source: Chinese High Technology Letters (《高技术通讯》), 2008, No. 12, pp. 1224-1230 (7 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: Supported by the 973 Program (G2005CB321806).
Keywords: grid, task scheduling, fault tolerance, node similarity
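
The replica-placement idea described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's similarity formula, so the sketch assumes a stand-in: each node is described by a set of attribute tags (network domain, operating system, installed libraries), and similarity is the Jaccard index of two tag sets. The names `jaccard`, `place_replicas`, and the example nodes are all hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Assumed similarity measure: shared tags divided by all tags."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def place_replicas(nodes: Dict[str, FrozenSet[str]], k: int) -> List[str]:
    """Greedily pick up to k nodes that are mutually dissimilar.

    Starts from the alphabetically first node, then repeatedly adds the
    node whose highest similarity to any already chosen node is lowest.
    """
    if k <= 0 or not nodes:
        return []
    names = sorted(nodes)
    chosen = [names[0]]
    while len(chosen) < min(k, len(names)):
        remaining = [n for n in names if n not in chosen]
        best = min(
            remaining,
            key=lambda n: max(jaccard(nodes[n], nodes[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen


# Hypothetical nodes: a1 and a2 share a network and software stack.
nodes = {
    "a1": frozenset({"net:A", "os:linux", "lib:mpi"}),
    "a2": frozenset({"net:A", "os:linux", "lib:mpi"}),
    "b1": frozenset({"net:B", "os:linux", "lib:blas"}),
    "c1": frozenset({"net:C", "os:solaris", "lib:blas"}),
}

print(place_replicas(nodes, 3))  # ['a1', 'c1', 'b1']
```

With three replicas, the sketch avoids placing a second copy on `a2`, because a fault in network A or a missing MPI library would take down both `a1` and `a2` at once. A real scheduler would also weigh load and expected completion time, which this sketch ignores.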
