Abstract
With the development of VLSI technology, integrating multiple processor cores on a single chip has become a reality, and the modern GPU is a typical multi-core processor. Driven by the rapid growth of computation-intensive applications, current GPUs have also acquired strong general-purpose computing capability. This paper first introduces CUDA and sparse matrices. Based on the CSR storage format for sparse matrices, it then proposes three program optimization methods under the CUDA model. All three methods are analyzed and implemented, and experiments on a GeForce 9600GT show a maximum speedup of about 4x over CPU computation.
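For readers unfamiliar with the CSR (compressed sparse row) format mentioned in the abstract, the following is a minimal sketch of the layout. It is not the paper's code. The abstract does not name the operation being optimized, so sparse matrix-vector multiplication is assumed here purely for illustration, and the function names `dense_to_csr` and `csr_matvec` are hypothetical.

```python
def dense_to_csr(dense):
    """Convert a dense row-major matrix (list of lists) into CSR arrays.

    Returns (values, col_idx, row_ptr):
      values  - nonzero entries, stored row by row
      col_idx - column index of each entry in `values`
      row_ptr - row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # marks the end of the current row
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A * x for a matrix A stored in CSR form (assumed operation)."""
    n_rows = len(row_ptr) - 1
    y = [0] * n_rows
    # Each row's inner loop is independent of the others, which is the kind
    # of parallelism a GPU kernel could map onto threads.
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y
```

Because only the nonzero entries are stored, both memory use and work scale with the number of nonzeros rather than with the full matrix size.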
Source
《计算机测量与控制》
CSCD
Peking University Core Journal
2010, Issue 8, pp. 1906-1908, 1912 (4 pages)
Computer Measurement & Control
Funding
National "863" Program project (2009AA01Z110)