Research on Image Integral Algorithm Optimization Based on OpenCL

Cited by: 6
Abstract: The image integral algorithm (integral image computation) is widely used in fast feature detection, so accelerating it on GPUs is of practical importance. However, because GPU hardware architectures are complex and differ from one another, optimizing the algorithm on a GPU and then achieving performance portability across GPU platforms is difficult. Based on an analysis of the underlying hardware architectures of different GPU platforms, this paper examines how different optimization methods affect performance on different GPU hardware platforms from several perspectives, including off-chip memory bandwidth utilization, compute resource utilization, and data locality. On this basis, an OpenCL implementation of the image integral algorithm is presented. Experimental results show that the optimized algorithm achieves speedups of 11.26x and 12.38x on AMD and NVIDIA GPUs, respectively, and that the optimized GPU kernels outperform the corresponding function in the NVIDIA NPP library by 55.01% and 65.17%, respectively. These results verify the effectiveness and performance portability of the proposed optimization methods.
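The abstract does not spell out the algorithm itself, so the following is only a minimal CPU reference sketch of what an integral image (summed-area table) computes, not the paper's OpenCL kernel. It assumes the common inclusive definition, in which each output entry holds the sum of all pixels above and to the left of it, and it uses the usual two-pass decomposition (a prefix sum along rows, then along columns) that GPU implementations typically parallelize. The names `integral_image` and `box_sum` are hypothetical and chosen for illustration.

```python
from itertools import accumulate


def integral_image(img):
    """Return the inclusive summed-area table of a 2D list of numbers.

    Illustrative reference only; assumes a rectangular, row-major image.
    """
    # Pass 1: prefix sum along each row.
    rows = [list(accumulate(row)) for row in img]
    # Pass 2: prefix sum down each column of the row-scanned result.
    cols = [list(accumulate(col)) for col in zip(*rows)]
    # Transpose back to row-major order.
    return [list(r) for r in zip(*cols)]


def box_sum(ii, top, left, bottom, right):
    """Sum of img[top..bottom][left..right] (inclusive) from four lookups."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total
```

Once the table is built, the sum over any rectangle costs at most four lookups regardless of its size, which is why the structure is attractive for fast feature detection. The two passes are independent across rows and then across columns, which is the property a GPU version would exploit.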
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list; 2013, No. 2, pp. 1-7 (7 pages)
Funding: National Natural Science Foundation of China (61133005, 61272136, 61100073); National 863 Program (2012AA010902, 2012AA010903); ISCAS-AMD Joint Fusion Software Center
Keywords: OpenCL; GPU; image integral algorithm; cross-platform
