The rapid growth of interconnected high performance workstations has produced a new computing paradigm called clustered of workstations computing. In these systems load balance problem is a serious impediment to achie...The rapid growth of interconnected high performance workstations has produced a new computing paradigm called clustered of workstations computing. In these systems load balance problem is a serious impediment to achieve good performance. The main concern of this paper is the implementation of dynamic load balancing algorithm, asynchronous Round Robin (ARR), for balancing workload of parallel tree computation depth-first-search algorithm on Cluster of Heterogeneous Workstations (COW) Many algorithms in artificial intelligence and other areas of computer science are based on depth first search in implicitty defined trees. For these algorithms a load-balancing scheme is required, which is able to evenly distribute parts of an irregularly shaped tree over the workstations with minimal interprocessor communication and without prior knowledge of the tree’s shape. For the (ARR) algorithm only minimal interprocessor communication is needed when necessary and it runs under the MPI (Message passing interface) that allows parallel execution on heterogeneous SUN cluster of workstation platform. The program code is written in C language and executed under UNIX operating system (Solaris version).展开更多
In recent years, high performance scientific computing under workstation cluster connected by local area network is becoming a hot point. Owing to both the longer latency and the higher overhead for protocol processin...In recent years, high performance scientific computing under workstation cluster connected by local area network is becoming a hot point. Owing to both the longer latency and the higher overhead for protocol processing compared with the powerful single workstation capacity, it is becoming severe important to keep balance not only for numerical load but also for communication load, and to overlap communications with computations while parallel computing. Hence,our efficiency evaluation rules must discover these capacities of a given parallel algorithm in order to optimize the existed algorithm to attain its highest parallel efficiency. The traditional efficiency evaluation rules can not succeed in this work any more. Fortunately, thanks to Culler's detail discuss in LogP model about interconnection networks for MPP systems, we present a system of efficiency evaluation rules for parallel computations under workstation cluster with PVM3.0 parallel software framework in this paper. These rules can satisfy above acquirements successfully. At last, two typical synchronous,and asynchronous applications are designed to verify the validity of these rules under 4 SGIs workstations cluster connected by Ethernet.展开更多
Dynamic task assignment and migration are the key technique to load balancing which plays an important role in the achievement of high performance in distributed computing system. In this paper, we describe the design...Dynamic task assignment and migration are the key technique to load balancing which plays an important role in the achievement of high performance in distributed computing system. In this paper, we describe the design and implementation of an online thread scheduling and migration system (S&M) based on a previous work of LWP -MPI. Experimental results show that performance is enhanced.展开更多
Parallel finite element method using domain decomposition technique is adapted to a distributed parallel environment of workstation cluster. The algorithm is presented for parallelization of the preconditioned conjuga...Parallel finite element method using domain decomposition technique is adapted to a distributed parallel environment of workstation cluster. The algorithm is presented for parallelization of the preconditioned conjugate gradient method based on domain decomposition. Using the developed code, a dam structural analysis problem is solved on workstation cluster and results are given. The parallel performance is analyzed.展开更多
This paper describes a PVM task scheduler designed and implemented by the authors. The scheduler supports selecting idle workstations, scheduling pool tasks and dynamically produced subtasks. It can improve resource u...This paper describes a PVM task scheduler designed and implemented by the authors. The scheduler supports selecting idle workstations, scheduling pool tasks and dynamically produced subtasks. It can improve resource utilization,reduce job response time and simplify programming.展开更多
Some testing results on DAWNING-1000, Paragon and workstation cluster are described in this paper. On the home-made parallel system DAWNING-1000 with 32 computational processors, the practical performance of 1.117 Gfl...Some testing results on DAWNING-1000, Paragon and workstation cluster are described in this paper. On the home-made parallel system DAWNING-1000 with 32 computational processors, the practical performance of 1.117 Gflops and 1.58 Gflops has been measured in solving a dense linear system and doing matrix multiplication, respectively. The scalability is also investigated. The importance of designing efficient parallel algorithms for evaluating parallel systems is emphasized.展开更多
This paper discusses the compile time task scheduling of parallel program running on cluster of SMP workstations. Firstly, the problem is stated formally and transformed into a graph parti-tion problem and proved to b...This paper discusses the compile time task scheduling of parallel program running on cluster of SMP workstations. Firstly, the problem is stated formally and transformed into a graph parti-tion problem and proved to be NP-Complete. A heuristic algorithm MMP-Solver is then proposed to solve the problem. Experiment result shows that the task scheduling can reduce communication over-head of parallel applications greatly and MMP-Solver outperforms the existing algorithms.展开更多
文摘The rapid growth of interconnected high performance workstations has produced a new computing paradigm called clustered of workstations computing. In these systems load balance problem is a serious impediment to achieve good performance. The main concern of this paper is the implementation of dynamic load balancing algorithm, asynchronous Round Robin (ARR), for balancing workload of parallel tree computation depth-first-search algorithm on Cluster of Heterogeneous Workstations (COW) Many algorithms in artificial intelligence and other areas of computer science are based on depth first search in implicitty defined trees. For these algorithms a load-balancing scheme is required, which is able to evenly distribute parts of an irregularly shaped tree over the workstations with minimal interprocessor communication and without prior knowledge of the tree’s shape. For the (ARR) algorithm only minimal interprocessor communication is needed when necessary and it runs under the MPI (Message passing interface) that allows parallel execution on heterogeneous SUN cluster of workstation platform. The program code is written in C language and executed under UNIX operating system (Solaris version).
文摘In recent years, high performance scientific computing under workstation cluster connected by local area network is becoming a hot point. Owing to both the longer latency and the higher overhead for protocol processing compared with the powerful single workstation capacity, it is becoming severe important to keep balance not only for numerical load but also for communication load, and to overlap communications with computations while parallel computing. Hence,our efficiency evaluation rules must discover these capacities of a given parallel algorithm in order to optimize the existed algorithm to attain its highest parallel efficiency. The traditional efficiency evaluation rules can not succeed in this work any more. Fortunately, thanks to Culler's detail discuss in LogP model about interconnection networks for MPP systems, we present a system of efficiency evaluation rules for parallel computations under workstation cluster with PVM3.0 parallel software framework in this paper. These rules can satisfy above acquirements successfully. At last, two typical synchronous,and asynchronous applications are designed to verify the validity of these rules under 4 SGIs workstations cluster connected by Ethernet.
文摘Dynamic task assignment and migration are the key technique to load balancing which plays an important role in the achievement of high performance in distributed computing system. In this paper, we describe the design and implementation of an online thread scheduling and migration system (S&M) based on a previous work of LWP -MPI. Experimental results show that performance is enhanced.
基金Project supported by Key Project Science Foundation of ShanghaiMunicipal Commission of Education (Grant No .03AZ03)
文摘Parallel finite element method using domain decomposition technique is adapted to a distributed parallel environment of workstation cluster. The algorithm is presented for parallelization of the preconditioned conjugate gradient method based on domain decomposition. Using the developed code, a dam structural analysis problem is solved on workstation cluster and results are given. The parallel performance is analyzed.
文摘This paper describes a PVM task scheduler designed and implemented by the authors. The scheduler supports selecting idle workstations, scheduling pool tasks and dynamically produced subtasks. It can improve resource utilization,reduce job response time and simplify programming.
文摘Some testing results on DAWNING-1000, Paragon and workstation cluster are described in this paper. On the home-made parallel system DAWNING-1000 with 32 computational processors, the practical performance of 1.117 Gflops and 1.58 Gflops has been measured in solving a dense linear system and doing matrix multiplication, respectively. The scalability is also investigated. The importance of designing efficient parallel algorithms for evaluating parallel systems is emphasized.
基金This work was supported by the National Natural Science Foundation of China (Grant No. 69933020) the "973" Program (Grant No. G1999032702).
文摘This paper discusses the compile time task scheduling of parallel program running on cluster of SMP workstations. Firstly, the problem is stated formally and transformed into a graph parti-tion problem and proved to be NP-Complete. A heuristic algorithm MMP-Solver is then proposed to solve the problem. Experiment result shows that the task scheduling can reduce communication over-head of parallel applications greatly and MMP-Solver outperforms the existing algorithms.