Abstract: Clusters of Shared Memory Processors (SMP) workstations are becoming increasingly popular. To optimize communication between the workstations, a new graph partitioning problem was formulated for scheduling tasks in SMP clusters. The problem is NP-complete, so a heuristic algorithm was developed based on the algorithm of Lee, Kim and Park. Experimental results indicate that our algorithm outperforms theirs, especially when the number of partitions is large. The algorithm can be integrated into a parallelizing compiler as a back-end optimizer for the distributed code generator.
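The abstract does not state the objective of the partitioning problem. As a rough illustration only, the sketch below assumes a common choice for this kind of scheduling: tasks are graph vertices, edge weights are communication volumes, and the cost of a partition is the total weight of edges that cross between workstations. The function name `cut_weight` and the sample graph are hypothetical and are not taken from the paper.

```python
def cut_weight(edges, part):
    """Sum the weights of edges whose two endpoints lie in different partitions.

    edges: iterable of (task_u, task_v, weight) tuples (assumed representation)
    part:  dict mapping each task to the index of its partition (workstation)
    """
    return sum(w for u, v, w in edges if part[u] != part[v])


# Hypothetical task graph: (task, task, communication volume)
edges = [("a", "b", 4), ("b", "c", 1), ("c", "d", 3), ("a", "d", 2)]

# Tasks a and b on workstation 0, tasks c and d on workstation 1
part = {"a": 0, "b": 0, "c": 1, "d": 1}

print(cut_weight(edges, part))  # edges b-c and a-d cross the cut
```

Under this assumed objective, a heuristic would search for the assignment `part` that keeps the returned value small while balancing load across workstations.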
Funding: the National Natural Science Foundation of China (69603005), and the Science Foundation of Shanghai Municipal Commission of Sc
Abstract: Using commodity SMPs (shared memory processors) to build cluster-based supercomputers has become a mainstream trend. Yet programming this kind of supercomputer system requires an environment that supports both message passing and shared memory programming. This paper describes our preliminary work on targeting a BSP library to clusters of SMPs. To exploit the maximum performance potential that a cluster of SMPs offers, we adopt threads to reduce system overhead and to exploit the capacity of the SMPs. A fore-layer synchronization mechanism is proposed to support barrier synchronization within an SMP node, within a group of SMP nodes, and across the whole cluster, respectively. A comparison is made between our BSP library and currently available BSP libraries such as PUB.
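The abstract gives no implementation details for the synchronization mechanism. The following is a minimal sketch of the general idea of a hierarchical barrier, reduced to two levels (within a node, then across nodes) and simulated entirely with Python threads; in a real cluster the cross-node step would use message passing. The helper `make_hier_barrier` and all other names are hypothetical, not the paper's API.

```python
import threading


def make_hier_barrier(nodes, threads_per_node):
    """Build a two-level barrier: one local barrier per node plus one global barrier."""
    local = [threading.Barrier(threads_per_node) for _ in range(nodes)]
    global_barrier = threading.Barrier(nodes)

    def wait(node):
        # Phase 1: all threads of this node meet; wait() hands each a unique index.
        if local[node].wait() == 0:
            # The thread that drew index 0 acts as leader and meets the other leaders.
            global_barrier.wait()
        # Phase 2: the leader rejoins, which releases the node's other threads.
        local[node].wait()

    return wait


NODES, THREADS = 2, 3
wait = make_hier_barrier(NODES, THREADS)
arrived, seen = [], []
lock = threading.Lock()


def worker(node, tid):
    with lock:
        arrived.append((node, tid))
    wait(node)
    with lock:
        # After the barrier, every thread should observe all arrivals.
        seen.append(len(arrived))


threads = [threading.Thread(target=worker, args=(n, t))
           for n in range(NODES) for t in range(THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(seen)
```

Only one thread per node takes part in the cross-node step, which is the usual motivation for layering barriers on a cluster of SMPs: cheap shared-memory synchronization inside a node, and a single participant per node for the more expensive inter-node step.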