
A GPU Low-Power Optimization Technique Based on a Parallelism Analysis Model (一种基于并行度分析模型的GPU功耗优化技术)

Cited by: 13
Abstract: As hardware capabilities have grown and software development environments have matured, GPUs have come to be used for general-purpose computing, helping CPUs accelerate program execution. To achieve high performance, a GPU typically contains hundreds or even thousands of computing cores. This density of computing resources gives GPUs much higher performance than CPUs, but also higher power consumption, and power has become one of the main constraints on GPU development. Dynamic voltage and frequency scaling (DVFS) is widely used for low-power optimization of processors, but applying it to GPUs first requires analyzing and modeling how programs execute on the GPU, so that the optimization strategy can be chosen according to the characteristics of the application. In addition, a GPU consists mainly of a graphics processor chip and off-chip DRAM, and prior work has pointed out that power optimization for such systems should consider the processor and the memory together, so that the two can be coordinated for a better result. Building on an existing GPU performance model based on program parallelism analysis, this paper accounts for the power consumption of both the compute and the memory components and establishes a GPU power optimization model under a performance constraint. For a given program, the model gives DVFS strategies for the processor and the memory separately, minimizing power while satisfying the performance constraint. The authors evaluated the model with nine test cases on three simulated GPU platforms. The results show that the method achieves optimal GPU energy consumption while the performance deviates from the constraint by less than 10%.
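Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2011, Issue 4, pp. 705-716 (12 pages)
Funding: Supported by the National Natural Science Foundation of China (90620162)
Keywords: GPU; parallelism model; power model; low power optimization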
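
The abstract describes choosing separate DVFS settings for the processor and the memory so that energy is minimized while a performance constraint still holds. The sketch below illustrates that general idea only; it is not the paper's model. It assumes a toy execution-time model in which compute and memory phases overlap perfectly, the classic CMOS dynamic-power approximation P = C·V²·f, no static power, and invented operating points. Every name and number in it (`CORE_POINTS`, `MEM_POINTS`, `run_time`, `best_setting`, the work units) is hypothetical.

```python
from itertools import product

# Hypothetical operating points as (frequency in GHz, voltage in V).
# These values are illustrative assumptions, not taken from the paper.
CORE_POINTS = [(0.6, 0.9), (0.9, 1.0), (1.2, 1.1), (1.5, 1.2)]
MEM_POINTS = [(0.4, 1.2), (0.8, 1.35), (1.2, 1.5)]


def dynamic_power(freq_ghz, volt, cap=1.0):
    """CMOS dynamic-power approximation P = C * V^2 * f, in arbitrary units."""
    return cap * volt ** 2 * freq_ghz


def run_time(core_f, mem_f, compute_work, memory_work):
    """Toy model: compute and memory overlap fully, so the slower side dominates."""
    return max(compute_work / core_f, memory_work / mem_f)


def best_setting(compute_work, memory_work, max_slowdown):
    """Exhaustively search all (core, memory) pairs.

    Returns (energy, core_point, mem_point) with the lowest energy among the
    pairs whose run time is at most max_slowdown times the fastest run time.
    """
    fastest = run_time(CORE_POINTS[-1][0], MEM_POINTS[-1][0],
                       compute_work, memory_work)
    limit = fastest * max_slowdown
    best = None
    for core, mem in product(CORE_POINTS, MEM_POINTS):
        t = run_time(core[0], mem[0], compute_work, memory_work)
        if t > limit:
            continue  # violates the performance constraint
        energy = (dynamic_power(*core) + dynamic_power(*mem)) * t
        if best is None or energy < best[0]:
            best = (energy, core, mem)
    return best


if __name__ == "__main__":
    # A memory-bound workload, then a compute-bound one, with no slowdown allowed.
    print(best_setting(compute_work=1.0, memory_work=10.0, max_slowdown=1.0))
    print(best_setting(compute_work=10.0, memory_work=1.0, max_slowdown=1.0))
```

Under these assumptions, a memory-bound workload keeps the memory at its highest point and drops the core to its lowest, while a compute-bound workload does the opposite. This is the kind of coordination between processor and memory that the abstract argues for. A real model would replace `run_time` with a calibrated performance model and add static power.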

