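Abstract: Processing of radar remote sensing images is limited by the memory of a single machine, so it generally relies on I/O functions that randomly access image files on disk. As a result, processing an entire image takes a great deal of time and rarely meets the needs of practical applications. Using the JIAJIA software, a distributed shared memory network system, the physical memory of multiple microcomputers is connected to form a larger shared memory space, which lets multiple microcomputers process remote sensing images synchronously, conveniently, and quickly. By analyzing the serial algorithms for SAR image geometric correction, image filtering, and supervised classification, corresponding parallel algorithms were developed. Experiments were run on eight Pentium II PCs running Linux, each with a 400 MHz clock and 256 MB of memory, and all of them achieved superlinear speedup.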
Funding: The National Natural Science Foundation of China (No. 60073018).
Abstract: A major overhead in software DSM (Distributed Shared Memory) is the cost of remote memory accesses, both those required by the protocol and those induced by false sharing. This paper introduces a dynamic prefetching method implemented in the JIAJIA software DSM to reduce the system overhead caused by remote accesses. The method records the interleaved string of INV (invalidation) and GETP (getting a remote page) operations for each cached page and analyzes the periodicity of the string when a page is invalidated on a lock or barrier. A prefetching request is issued after the lock or barrier if the periodicity analysis indicates that GETP will be the next operation in the string. Multiple prefetching requests to the same host are merged into a single message. Performance evaluation with eight well-accepted benchmarks on a cluster of sixteen PowerPC workstations shows that the prefetching scheme significantly reduces the page fault overhead, yielding a performance increase of 15%-20% in three benchmarks and around 8%-10% in another three. The average extra traffic caused by useless prefetches is only 7%-13% in the evaluation.
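The prediction step described in this abstract can be illustrated with a minimal Python sketch. The abstract does not spell out how the periodicity analysis is performed, so the rule used here (find the smallest period that explains the whole recorded string, then read off the next symbol) is an assumption, as are the names `predict_next`, `plan_prefetches`, `histories`, and `home_of`. It is not JIAJIA's actual implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def predict_next(history: str, min_repeats: int = 2) -> Optional[str]:
    """Predict the next operation from a per-page history string.

    'I' stands for INV and 'G' for GETP. Assumed rule: find the smallest
    period that the whole string obeys (seen at least `min_repeats` times)
    and return the symbol that would continue the pattern, else None.
    """
    n = len(history)
    for period in range(1, n // min_repeats + 1):
        if all(history[i] == history[i - period] for i in range(period, n)):
            return history[n - period]
    return None


def plan_prefetches(
    invalidated: Iterable[int],
    histories: Dict[int, str],
    home_of: Dict[int, str],
) -> Dict[str, List[int]]:
    """Group pages predicted to be fetched next by their home host,

    so that all requests to one host can travel in a single message.
    """
    batches: Dict[str, List[int]] = defaultdict(list)
    for page in invalidated:
        if predict_next(histories[page]) == "G":
            batches[home_of[page]].append(page)
    return dict(batches)


# Hypothetical state at a barrier: pages 1, 2, 3, 4 were just invalidated.
histories = {1: "IGIGI", 2: "IIGIIGI", 3: "IGIGI", 4: "IGIIGGI"}
home_of = {1: "hostA", 2: "hostA", 3: "hostA", 4: "hostB"}
plan = plan_prefetches([1, 2, 3, 4], histories, home_of)
```

In this example pages 1 and 3 alternate INV and GETP, so a GETP is predicted and both requests are merged into one batch for `hostA`. Page 2 is predicted to be invalidated again and page 4 shows no period, so neither is prefetched.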
Funding: The work of this paper is supported by the National '863' High-Tech Programme of China under grant No. 863-306-ZD01-02-5 and N
Abstract: Page-based software DSM systems suffer from false sharing caused by their large sharing granularity, and support only one-dimensional Block or Cyclic-block data distribution schemes. Applications running on them therefore suffer from poor data locality and can exploit parallelism only when using a large number of processors. This paper presents a way to support flexible data distribution (FDD) on software DSM systems. Small granularity-tunable blocks, whose size can be set by the compiler or the programmer, are used to overlap the working data sets distributed among processors. FDD was implemented on a software DSM system called JIAJIA. Experiments show that the proposed flexible data distribution is more effective than the Block/Cyclic-block distribution schemes used by most current DSM systems, and the performance of the applications used in the experiments is significantly improved.
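For readers unfamiliar with the two baseline schemes named in this abstract, the following Python sketch shows the textbook one-dimensional Block and Cyclic-block owner mappings. It illustrates only the baselines, not the FDD mechanism itself, which the abstract does not describe in enough detail to reproduce. The function names and the parameters `n` (number of elements), `p` (number of processors), and `b` (block size) are assumptions chosen for illustration.

```python
import math
from typing import List


def block_owner(i: int, n: int, p: int) -> int:
    """Block distribution: element i of n goes to one contiguous chunk per processor."""
    chunk = math.ceil(n / p)  # size of each processor's contiguous chunk
    return i // chunk


def cyclic_block_owner(i: int, b: int, p: int) -> int:
    """Cyclic-block distribution: blocks of b elements are dealt out round-robin."""
    return (i // b) % p


def owners_block(n: int, p: int) -> List[int]:
    """Owner of every element under Block distribution."""
    return [block_owner(i, n, p) for i in range(n)]


def owners_cyclic_block(n: int, b: int, p: int) -> List[int]:
    """Owner of every element under Cyclic-block distribution."""
    return [cyclic_block_owner(i, b, p) for i in range(n)]


block_map = owners_block(8, 4)             # [0, 0, 1, 1, 2, 2, 3, 3]
cyclic_map = owners_cyclic_block(8, 2, 2)  # [0, 0, 1, 1, 0, 0, 1, 1]
```

In a page-based DSM the unit being distributed is a whole page, so with these two schemes the block size cannot drop below the page size. That is the granularity restriction the abstract says FDD relaxes.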
Abstract: The performance gap between software DSM systems and message passing platforms has greatly hindered the adoption of software DSM systems, despite the considerable effort devoted to this area over the past decade. In this paper, we set out to find where future design efforts should be focused. We first analyze in detail the components of the total system overhead of software DSM systems. Using the state-of-the-art software DSM system JIAJIA, we measure these components on the Dawning parallel system and draw five important conclusions that differ from some traditional viewpoints. (1) The performance of the JIAJIA software DSM system is acceptable. For four of eight applications, the parallel efficiency achieved by JIAJIA is about 80%, while for two others, 70% efficiency can be obtained. (2) 40.94% of interrupt service time overlaps with waiting time. (3) Encoding and decoding diffs do not cost much time (<1%), so using hardware support to encode/decode diffs and send/receive messages is not worthwhile. (4) Great effort should be made to reduce the data miss penalty and optimize synchronization operations, which occupy 11.75% and 13.65% of total execution time respectively. (5) Communication hardware overhead accounts for 66.76% of the whole communication time in the experimental environment, and communication software overhead does not take as much time as expected. Moreover, by studying the effect of CPU speed on system overhead, we find that the common speedup formula for distributed memory systems does not hold for software DSM systems. We therefore design a new speedup formula specific to software DSM systems, and point out that when the CPU speed increases, the speedup can increase too even if the network speed is fixed, which is impossible in message passing systems. Finally, we argue that the JIAJIA system has the desired scalability.
Funding: This work was supported by Grants-in-Aid for Scientific Research of the Japan Society for the Promotion of Science (JSPS KAKENHI), Grant Number 18H00766 (principal investigator: Takashi Nakata) and Grant Number 18KK0027 (principal investigator: Yasuhiro Kumahara).
Abstract: The Arun and Tista Rivers, which flow across the Himalayas, are commonly known as antecedent valleys that overcame the rapid uplift of the Higher Himalayan ranges. To clarify whether the idea of antecedent rivers is acceptable, we investigated the geomorphology of the Himalayas between the eastern Nepal and Bhutan Himalayas. The southern part of the Tibetan Plateau, extending across the Himalayas as tectonically undeformed glaciated terrain named the 'Tibetan Corridor,' does not suggest regional uplift of the Higher Himalayas. The 8,000-m class mountains of Everest, Makalu, and Kanchenjunga are isolated residual peaks on glaciated terrain composed of mountain peaks 4,000–6,000 m high. Tibetan glaciers, commonly beheaded by Himalayan glaciers along the great watershed of the Himalayas, suggest the expansion of Himalayan river drainage by glaciation. In the narrow upstream regions of the Arun and Tista Rivers, which receive less precipitation behind the range, it is hard to collect enough water to supply the power for down-cutting their channels against the uplifting Himalayas. The fission track ages of the Higher Himalayan Crystalline Nappe suggest that the Himalayas attained their present altitude by 11–10 Ma, and that the Arun and Tista Rivers formed deep gorges across the Himalayas by headward erosion.