Research on a Data Replication Algorithm for Job Objects in Cluster Management Systems

A New Method for Replicating Scalable Data of Job Objects in Cluster Management System


Abstract: Excessive I/O load on nodes and limited scalability are the key factors constraining the efficiency of cluster management systems. Replicating job objects among nodes and applying snapshot-based concurrent scheduling reduces the waiting time of read/write operations and lets operations interleave across cluster nodes, which improves parallelism. The paper first describes the snapshot model of job objects, then gives the job data access protocol and implements a scalable replication algorithm for job objects, and finally evaluates and analyzes the algorithm.

Aim. We propose a new data replication method based on a snapshot model to use I/O resources more efficiently. Section 1 of the full paper discusses job objects in a cluster management system. Section 2 presents the replication protocol for job objects in two subsections: the snapshot of job objects (subsection 2.1) and the protocol of job object operations (subsection 2.2). Snapshot-based concurrent scheduling, introduced in Section 2, reduces the waiting time for reads and writes. Section 3 explains and implements the concurrent scheduling algorithm that provides concurrent access to job objects. The algorithm uses a scheduler to check for conflicts and allows only valid operations to be stored, thereby ensuring data consistency. In Section 4, we connected 100 clients to the cluster server and tracked three operations: job creation, job execution, and job status inquiry. The experimental results, shown in Figs. 3 and 4, indicate preliminarily that the number of inquiries completed per unit of time increases with the number of nodes in the replication cluster, which reduces response time and raises efficiency by around 30%.
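The abstract says only that a scheduler checks for conflicts and stores valid operations; it does not give the algorithm itself. The following is therefore a minimal illustrative sketch, not the authors' method. It assumes a first-committer-wins rule of the kind used in snapshot isolation, and the names `JobStore`, `Scheduler`, `snapshot`, and `commit` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class JobStore:
    """Hypothetical replica-side store of job objects (assumption, not from the paper)."""
    data: dict = field(default_factory=dict)      # job_id -> latest committed state
    versions: dict = field(default_factory=dict)  # job_id -> number of commits so far

    def snapshot(self):
        # Readers and writers work on private copies taken at start time,
        # so a status inquiry never waits for a concurrent writer.
        return dict(self.data), dict(self.versions)


class Scheduler:
    """Checks write-write conflicts and stores only valid operations."""

    def __init__(self, store):
        self.store = store

    def commit(self, snap_versions, writes):
        # A write conflicts if the job was committed by another client
        # after this client's snapshot was taken (first committer wins).
        for job_id in writes:
            current = self.store.versions.get(job_id, 0)
            if current != snap_versions.get(job_id, 0):
                return False  # reject the whole operation, store nothing
        for job_id, state in writes.items():
            self.store.data[job_id] = state
            self.store.versions[job_id] = self.store.versions.get(job_id, 0) + 1
        return True


if __name__ == "__main__":
    store = JobStore()
    sched = Scheduler(store)
    _, v0 = store.snapshot()
    print(sched.commit(v0, {"job-1": "created"}))
    # Two clients take snapshots of the same state, then both try to write.
    _, va = store.snapshot()
    _, vb = store.snapshot()
    print(sched.commit(va, {"job-1": "running"}))
    print(sched.commit(vb, {"job-1": "cancelled"}))
```

In this sketch the second of two clients that started from the same snapshot is rejected, so only one of the competing updates to a job object is stored. A real replicated deployment would also have to propagate accepted writes to the other replicas, which is omitted here.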

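Authors: Tang Xiaochun (汤小春), Hu Jie (胡杰), Yan Lei (阎磊)

Affiliation: School of Computer Science, Northwestern Polytechnical University

Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2008, No. 5, pp. 566-569 (4 pages)

Keywords: scalability, job analysis, job object, snapshot model, cluster management system

Classification: TP31 [Automation and Computer Technology: Computer Software and Theory]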



