A combined storage structure for image processing algorithms on GPU (cited by: 2)
Abstract: Most image processing algorithms can be accelerated on a GPU to achieve better performance, but the scheduling of data transfers against kernel execution remains the main bottleneck to further speedup. The usual remedy is to use GPU streams to overlap kernel execution with data transfers, hiding part of the transfer and execution time. However, because of the characteristics of the CUDA programming model and the limits of GPU hardware resources, in some cases tasks on each stream still execute serially even when many streams are created for overlapping, so the speedup cannot improve further. This paper therefore proposes a new data storage structure, the Combined Storage Structure (CSS). CSS merges the images to be processed so that the operator kernels and data transfers on a single stream are merged as well, which reduces the fixed cost of, and the call gaps between, data transfers and kernel executions. Experimental results show that CSS improves the performance of GPU image processing algorithms not only with a single stream but also with multiple streams. CSS is practical and scalable, and it suits workloads that contain many operators or that process batches of small images. The proposed method also offers a new research direction for GPU acceleration of image processing algorithms.
Authors: ZUO Xian-yu; ZHANG Zhe; HUANG Xiang-zhi; GE Qiang; ZHANG Li-tao; ZANG Wen-qian (Henan Key Laboratory of Big Data Analysis and Processing, Kaifeng 475004; Institute of Data and Knowledge Engineering, College of Computer and Information Engineering, Henan University, Kaifeng 475004; College of Science, Zhengzhou University of Aeronautics, Zhengzhou 450015; Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094; Zhongke Langfang Institute of Spacial Information Application, Langfang 065000, China)
Source: Computer Engineering & Science (《计算机工程与科学》); indexed in CSCD and the Peking University Core Journals list; 2020, No. 2, pp. 197-202 (6 pages)
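Funding: National Key Research and Development Program of China (2017YFD0301105); National Natural Science Foundation of China (U1704122, U1604145); Henan Province Science and Technology Program (182102210242, 182102110065, 192102210096); Key Scientific Research Project Plan of Henan Higher Education Institutions, Basic Research Special Project (20zx003); Henan Province Science and Technology Innovation Outstanding Youth Fund (184100510004); Aeronautical Science Foundation of China (2017ZD55014).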
Keywords: image processing; GPU; CUDA stream; Combined Storage Structure (CSS); overlap
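
The merging idea in the abstract can be illustrated with a small host-side sketch. The abstract does not describe the actual CSS layout, so everything here is an assumption made for illustration: the names `Packed`, `pack`, `unpack`, `invert`, and `transfer_cost` are hypothetical, the images are flat 8-bit buffers, the operator is purely per-pixel (a neighborhood operator would need boundary handling between images), and the cost model is a toy one in which every transfer pays a fixed cost plus a size-proportional cost. It runs on the CPU and only mimics what a single merged transfer and a single kernel launch would do.

```python
from dataclasses import dataclass


@dataclass
class Packed:
    data: bytearray  # all images stored back to back
    offsets: list    # start index of each image inside data
    sizes: list      # byte length of each image


def pack(images):
    """Concatenate images into one contiguous buffer plus an offset table."""
    data = bytearray()
    offsets, sizes = [], []
    for img in images:
        offsets.append(len(data))
        sizes.append(len(img))
        data.extend(img)
    return Packed(data, offsets, sizes)


def unpack(packed):
    """Split the merged buffer back into the individual images."""
    return [bytes(packed.data[o:o + s])
            for o, s in zip(packed.offsets, packed.sizes)]


def invert(buf):
    """Stand-in for a per-pixel kernel: 8-bit inversion of a whole buffer."""
    return bytearray(255 - b for b in buf)


def transfer_cost(n_transfers, total_bytes, fixed_cost, bytes_per_time_unit):
    """Toy model: fixed cost per transfer plus size-proportional cost."""
    return n_transfers * fixed_cost + total_bytes / bytes_per_time_unit


# Three small "images" of different sizes.
images = [bytes([0, 10, 20]), bytes([255, 128]), bytes([7])]

packed = pack(images)
packed.data = invert(packed.data)  # one call covers every image
result = unpack(packed)

total = sum(len(img) for img in images)
separate = transfer_cost(len(images), total, fixed_cost=10.0, bytes_per_time_unit=1.0)
merged = transfer_cost(1, total, fixed_cost=10.0, bytes_per_time_unit=1.0)
```

Under this toy model the merged transfer pays the fixed cost once instead of once per image, which is the effect the abstract attributes to CSS for batches of small images; the real gain depends on the hardware and on the measurements reported in the paper.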