Construction of an Embedded Mali GPU Simulator for OpenCL (cited by: 2)
Abstract: To meet the need for a simulator for general-purpose computing on embedded GPUs, the computing cores and memory structure of the general-purpose graphics processing unit simulator (GPGPU-sim) are compared with those of the Mali GPU. On that basis, the simulation flow and structure of an OpenCL-oriented Mali GPU simulator are established, and methods are designed for obtaining GPU microarchitecture parameters such as the number of compute units, the number of registers, and the minimum parallel granularity. GPGPU-sim is then modified and configured to build a simulator for the specific GPU architecture. The simulator's accuracy is tested with OpenCL programs such as matrix multiplication and image processing, using the difference in execution cycle counts between the simulator and the hardware platform as the evaluation criterion. Experimental results show that, for the unoptimized OpenCL programs in the test set, 70% differ by no more than 30% in cycle count between the two platforms; for the optimized OpenCL programs, 90% differ by no more than 30%. The constructed GPU simulator can therefore support simulation and performance evaluation of OpenCL programs on embedded GPUs.
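The evaluation criterion described in the abstract (the share of programs whose cycle counts on the simulator and on hardware differ by no more than 30%) can be illustrated with a minimal sketch. The function names and the measurement values below are hypothetical and are not taken from the paper. The sketch also assumes the gap is normalized by the hardware cycle count, which the abstract does not specify.

```python
def relative_cycle_gap(sim_cycles, hw_cycles):
    """Relative cycle difference, normalized by the hardware count (an assumption)."""
    if hw_cycles <= 0:
        raise ValueError("hardware cycle count must be positive")
    return abs(sim_cycles - hw_cycles) / hw_cycles


def fraction_within(pairs, threshold=0.30):
    """Share of programs whose relative cycle gap does not exceed the threshold."""
    if not pairs:
        raise ValueError("no measurements given")
    hits = sum(1 for sim, hw in pairs if relative_cycle_gap(sim, hw) <= threshold)
    return hits / len(pairs)


# Hypothetical (simulator cycles, hardware cycles) pairs; NOT data from the paper.
measurements = [
    (120_000, 100_000),
    (95_000, 100_000),
    (150_000, 100_000),
    (80_000, 100_000),
]

print(f"{fraction_within(measurements):.0%} of programs are within 30%")
```

Normalizing by the hardware count treats the real GPU as ground truth. Normalizing by the simulator count instead would shift which programs fall inside the 30% band.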
Source: Journal of Xi'an Jiaotong University, 2015, No. 2, pp. 20-24, 68 (6 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National High Technology Research and Development Program of China (2012AA010904); National Natural Science Foundation of China (61375023)
Keywords: graphics processing unit (GPU); OpenCL; microarchitecture parameters; simulator

References (6)

  • 1. NVIDIA. NVIDIA GeForce 8800 GPU architecture overview, TB-02787-001_V01[R]. Santa Clara, CA, USA: NVIDIA Corporation, 2006.
  • 2. BAKHODA A, YUAN G L, FUNG W W L, et al. Analyzing CUDA workloads using a detailed GPU simulator[C]//Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software. Piscataway, NJ, USA: IEEE, 2009: 163-174.
  • 3. AAMODT T M, FUNG W W L, SINGH I, et al. GPGPU-Sim 3.x manual[EB/OL]. (2012-08-08)[2013-08-08]. http://gpgpu-sim.org/manual/index.php/GPGPU-Sim_3.x_Manual.
  • 4. WONG H, PAPADOPOULOU M M, SADOOGHI-ALVANDI M, et al. Demystifying GPU microarchitecture through microbenchmarking[C]//Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software. Piscataway, NJ, USA: IEEE, 2010: 235-246.
  • 5. TAYLOR R, LI Xiaoming. A micro-benchmark suite for AMD GPUs[C]//Proceedings of the 39th International Conference on Parallel Processing Workshops. Washington, DC, USA: IEEE Computer Society, 2010: 387-396.
  • 6. YANG Haiyan, SHI Xiaohua, SUN Qingyue, YAN Wanglong, YAN Xin, JIN Maozhong. Research and implementation of a GPGPU micro-benchmark suite for OpenCL[J]. Systems Engineering and Electronics, 2013, 35(12): 2631-2642. (cited by: 2)

Secondary references (17)

  • 1. Khronos Group. OpenCL: the open standard for parallel programming of heterogeneous systems[EB/OL]. [2012-10-11]. http://www.khronos.org/opencl/.
  • 2. NVIDIA. OpenCL programming guide for the CUDA architecture version 2.3[EB/OL]. [2012-10-11]. http://www.nvidia.com/content/cudazone/download/OpenCL/NVIDIA_OpenCL_ProgrammingGuide.pdf.
  • 3. Henning J L. SPEC CPU2000: Measuring CPU performance in the new millennium[J]. Computer, 2000, 33(7): 28-35.
  • 4. Phansalkar A, Joshi A, John L K. Subsetting the SPEC CPU2006 benchmark suite[J]. ACM SIGARCH Computer Architecture News, 2007, 35(1): 69-76.
  • 5. Phansalkar A, Joshi A, John L K. Analysis of redundancy and application balance in the SPEC CPU2006 benchmark suite[C]//Proc. of the 34th Annual International Symposium on Computer Architecture, 2007: 412-423.
  • 6. Gove D, Spracklen L. Evaluating the correspondence between training and reference workloads in SPEC CPU2006[J]. ACM SIGARCH Computer Architecture News, 2007, 35(1): 122-129.
  • 7. MediaBench Consortium. MediaBench II benchmark[EB/OL]. [2012-10-11]. http://euler.slu.edu/~fritts/mediabench/.
  • 8. Guthaus M R, Ringenberg J S, Ernst D, et al. MiBench: a free, commercially representative embedded benchmark suite[C]//Proc. of the IEEE International Workshop on Workload Characterization, 2001: 3-14.
  • 9. Transaction Processing Performance Council. TPC benchmarks[EB/OL]. [2012-10-11]. http://www.tpc.org/information/benchmarks.asp.
  • 10. Dongarra J, Bunch J, Moler C, et al. LINPACK[EB/OL]. [2012-10-11]. http://www.netlib.org/linpack/.
