2 articles found
Optimizing Linpack Benchmark on GPU-Accelerated Petascale Supercomputer (cited by: 3)
1
Authors: 王锋, 杨灿群, 杜云飞, 陈娟, 易会战, 徐炜遐. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2011, No. 5, pp. 854-865 (12 pages)
In this paper we present the programming of the Linpack benchmark on the TianHe-1 system, the first petascale supercomputer system of China and the largest GPU-accelerated heterogeneous system attempted to date. A hybrid programming model consisting of MPI, OpenMP and streaming computing is described to explore the task, thread and data parallelism of Linpack. We explain how we optimized the load distribution across the CPUs and GPUs using a two-level adaptive method and describe the implementation in detail. To overcome the low bandwidth of communication between the CPU and GPU, we present a software pipelining technique to hide the communication overhead. Combined with other traditional optimizations, the Linpack we developed achieved 196.7 GFLOPS on a single compute element of TianHe-1. This result is 70.1% of the peak compute capability and 3.3 times faster than the result obtained using the vendor's library. On the full configuration of TianHe-1, our optimizations resulted in a Linpack performance of 0.563 PFLOPS, which made TianHe-1 the 5th fastest supercomputer on the Top500 list in November 2009.
Keywords: petascale; Linpack; GPU; heterogeneous; supercomputer
A Large-Scale Study of Failures on Petascale Supercomputers (cited by: 2)
2
Authors: Rui-Tao Liu, Zuo-Ning Chen. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2018, No. 1, pp. 24-41 (18 pages)
With the rapid development of supercomputers, their scale and complexity are ever increasing, and their reliability and resilience face greater challenges. There are many important technologies in fault tolerance, such as proactive failure avoidance technologies based on fault prediction, reactive fault tolerance based on checkpointing, and scheduling technologies to improve reliability. Both qualitative and quantitative descriptions of the characteristics of system faults are critical for these technologies. This study analyzes the sources of failures on two typical petascale supercomputers, Sunway BlueLight (based on multi-core CPUs) and Sunway TaihuLight (based on heterogeneous manycore CPUs). It uncovers some interesting fault characteristics and finds previously unknown correlations among the faults of the main components. Finally, the paper analyzes the failure times of the two supercomputers at various granularities of resource and over different time spans, and builds a uniform multi-dimensional failure time model for petascale supercomputers.
Keywords: petascale supercomputer; fault characteristic; correlation relationship; multi-dimension; failure time model