A Proactive Approach to Ensure Dependability of Grid Workflows
(一种主动式的网格工作流可靠性保障方法)

Cited by: 1


Abstract: As a special kind of "programming" technology for constructing problem-solving applications on top of grid resources, grid workflow has attracted attention and made progress. However, ensuring the dependability of grid workflows remains a challenge. For the China National Grid (CNGrid) environment, a workflow meta-scheduling mechanism (VINCA Abstract Workflow) was previously proposed; it gives users a single entry point and hides details while making optimal use of the capabilities of existing workflow engines. Building on that mechanism, this paper proposes a proactive approach to ensuring the dependability of grid workflows. The failure rate and workload growth rate of each workflow engine over a recent time interval are used to predict the probability that the engine will successfully handle a request, and each compound activity (implemented as a sub-process) in a VINCA abstract workflow is scheduled to the "most reliable" engine accordingly. The aim is to improve the overall success rate and stability of workflow execution while avoiding the time cost and implementation complexity of traditional "reactive", after-the-fact recovery. A qualitative analysis based on an example scenario indicates that, when workflows are executed continuously and at large scale, the approach can make full use of workflow engine capabilities and effectively ensure the dependability of workflow execution.
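The selection step described in the abstract can be illustrated with a minimal sketch. The abstract names the two inputs (recent failure rate and workload growth rate) but does not give the prediction formula, so the scoring function below is a hypothetical placeholder. The record type `EngineStats` and the functions `predicted_success` and `pick_engine` are also illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class EngineStats:
    """Observations for one workflow engine over a recent interval (hypothetical record)."""
    name: str
    failed: int        # requests that failed during the interval
    handled: int       # requests handled during the interval
    load_start: float  # workload at the start of the interval
    load_end: float    # workload at the end of the interval


def failure_rate(s: EngineStats) -> float:
    """Fraction of handled requests that failed; 0.0 when nothing was handled."""
    return s.failed / s.handled if s.handled else 0.0


def load_growth_rate(s: EngineStats) -> float:
    """Relative workload change over the interval; 0.0 when the start load is not positive."""
    if s.load_start <= 0:
        return 0.0
    return (s.load_end - s.load_start) / s.load_start


def predicted_success(s: EngineStats) -> float:
    """ASSUMPTION: placeholder score, not the formula from the paper.

    The score falls as the failure rate rises and as the workload grows;
    shrinking workload is not rewarded.
    """
    growth_penalty = 1.0 / (1.0 + max(0.0, load_growth_rate(s)))
    return (1.0 - failure_rate(s)) * growth_penalty


def pick_engine(engines: list[EngineStats]) -> EngineStats:
    """Return the engine with the highest predicted success score."""
    return max(engines, key=predicted_success)


# Example with made-up numbers: engine-a fails less often but its load tripled.
engines = [
    EngineStats("engine-a", failed=2, handled=100, load_start=10, load_end=30),
    EngineStats("engine-b", failed=5, handled=100, load_start=10, load_end=11),
]
chosen = pick_engine(engines)
```

With these made-up numbers the sketch prefers `engine-b`: its failure rate is slightly higher, but its workload is nearly flat, whereas `engine-a` is penalized for rapid load growth. Any real weighting between the two parameters would have to come from the paper itself.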

Authors: Zhang Liyong, Han Yanbo

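Affiliations: Research Center for Grid and Service Computing, Institute of Computing Technology, Chinese Academy of Sciences; Graduate University of the Chinese Academy of Sciences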

Source: Acta Scientiarum Naturalium Universitatis Sunyatseni (《中山大学学报（自然科学版）》), 2008, No. 6, pp. 93-99 (7 pages). Indexed in: CAS, CSCD, Peking University Core Journals.

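Funding: National "863" High-Tech Research and Development Program of China (2006AA12Z202); National Natural Science Foundation of China (70673098); Chinese Academy of Sciences Olympic Science and Technology Fund (KACX1-03)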

Keywords: grid workflow; dependability; ensuring approaches

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]



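Citing literature (1)

1. Zhao Xiaowei, Zhang Liyong, Han Yanbo. VINCASim: a grid workflow dependability simulation tool [J]. Computer Science (计算机科学), 2009, 36(11): 143-147.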

