Research on a Data Replication Algorithm for Job Objects in Cluster Management Systems

A New Method for Replicating Scalable Data of Job Objects in Cluster Management System
Abstract: Heavy I/O load on nodes and limited scalability are the key factors constraining the efficiency of a cluster management system. Replicating job objects across nodes and scheduling them concurrently on the basis of snapshots reduces the waiting time of read/write operations and lets operations be interleaved across cluster nodes, which improves parallelism. The paper first describes the snapshot model of job objects, then gives the job data access protocol and implements a scalable replication algorithm for job objects, and finally evaluates and analyzes the algorithm.

Aim. We propose a new data replication method based on a snapshot model in order to use I/O resources more efficiently. Section 1 of the full paper discusses job objects in a cluster management system. Section 2 presents the replication protocol for job objects in two subsections: the snapshot of job objects (subsection 2.1) and the protocol of job object operations (subsection 2.2). In Section 2, snapshot-based concurrent scheduling reduces the waiting time for reads and writes. Section 3 explains the concurrent scheduling algorithm and implements it to provide concurrent access to job objects. The algorithm uses a scheduler to check for conflicts and allows only valid operations to be stored, thus ensuring data consistency. Section 4 reports experiments in which 100 clients were connected to the cluster server to track three operations: job creation, job execution, and job status inquiry. The experimental results, shown in Figs. 3 and 4, indicate preliminarily that the number of inquiries completed per unit of time increases as the number of nodes in the replication cluster increases, which reduces response time and raises efficiency by around 30%.
Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2008, No. 5, pp. 566-569 (4 pages)
Keywords: scalability, job analysis, job object, snapshot model, cluster management system
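
The conflict check described in the abstract (a scheduler that stores only valid operations) can be illustrated with a minimal sketch. It assumes a first-committer-wins rule in the style of snapshot isolation, which the abstract does not specify, and it applies accepted writes to every replica synchronously. The names `Replica`, `Transaction`, and `Scheduler` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One node's copy of the job objects: job_id -> (version, state)."""
    jobs: dict = field(default_factory=dict)


@dataclass
class Transaction:
    snapshot_version: int   # global version when the snapshot was taken
    snapshot: dict          # private copy used for reads, so reads never wait
    writes: dict = field(default_factory=dict)  # job_id -> new state

    def read(self, job_id):
        # A transaction sees its own writes first, then its snapshot.
        if job_id in self.writes:
            return self.writes[job_id]
        entry = self.snapshot.get(job_id)
        return entry[1] if entry else None

    def write(self, job_id, state):
        # Writes are buffered until the scheduler validates them.
        self.writes[job_id] = state


class Scheduler:
    """Hypothetical scheduler: validates write sets, then updates all replicas."""

    def __init__(self, n_replicas):
        self.replicas = [Replica() for _ in range(n_replicas)]
        self.version = 0

    def begin(self, replica_index=0):
        # Entries are immutable tuples, so a shallow copy is a stable snapshot.
        snapshot = dict(self.replicas[replica_index].jobs)
        return Transaction(self.version, snapshot)

    def commit(self, txn):
        # Assumed rule: reject if any written job changed after the snapshot.
        current = self.replicas[0].jobs
        for job_id in txn.writes:
            entry = current.get(job_id)
            if entry is not None and entry[0] > txn.snapshot_version:
                return False
        # Valid operation: store it on every replica under a new version.
        self.version += 1
        for replica in self.replicas:
            for job_id, state in txn.writes.items():
                replica.jobs[job_id] = (self.version, state)
        return True
```

In this sketch, two transactions that start from the same snapshot and both update the same job cannot both commit: the second one is rejected and would have to retry on a fresh snapshot. Status inquiries only read a snapshot and therefore never block.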