Research and Implementation of Performance Analysis Tool for CUDA Programs with Directive

Cited by: 1
Abstract: In recent years, the rapid development of graphics processing units (GPUs), together with the Compute Unified Device Architecture (CUDA) technology introduced by NVIDIA, has driven the adoption of GPUs in high performance computing (HPC). Researching and implementing a performance analysis tool for CUDA programs is important for making full use of the GPU's computational advantages and for improving the execution performance of parallel programs on the CUDA architecture. This paper analyzes the characteristics of the GPU hardware platform and the CUDA parallel programming model. Drawing on performance analysis methods for parallel programs in CPU cluster environments, it then designs and implements a directive-based performance analysis tool for CUDA programs. Experiments verify the tool's effectiveness on different GPU hardware platforms.
Source: Journal of University of Electronic Science and Technology of China, 2012, No. 2, pp. 280-284 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
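Funding: Key Project of Science and Technology Research of the Ministry of Education (108008); Key Discipline Program of the Beijing Municipal Education Commission (XK100080537).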
Keywords: CUDA; directive; high performance computing (HPC); performance analysis; program optimization