Survey on GPGPU and CUDA Unified Memory Research Status
Abstract: In the era of big data, the rapid development of fields such as scientific computing and artificial intelligence has steadily raised the demand for hardware computing power. The hardware architecture of the Graphics Processing Unit (GPU) makes it well suited to highly parallel computation. In recent years, GPUs and fields such as artificial intelligence and scientific computing have advanced one another, refining GPU functionality and leading to mature General-Purpose Graphics Processing Units (GPGPUs). GPGPUs are now among the most important coprocessors for Central Processing Units (CPUs). However, a GPU's hardware configuration is difficult to change after manufacture and its device memory capacity is limited, so insufficient device memory can significantly degrade computing performance on large datasets. Compute Unified Device Architecture (CUDA) 6.0 introduced unified memory, which lets the GPGPU and the CPU share a virtual memory space, thereby simplifying heterogeneous programming and expanding the memory space accessible to the GPGPU. Unified memory offers a feasible solution for processing large datasets on GPGPUs and alleviates, to some extent, the limited capacity of GPU device memory. However, using unified memory also introduces performance problems, and effective memory management within unified memory becomes the key to improving performance. This article surveys the development and application of CUDA unified memory, covering its features, evolution, advantages, and limitations, its applications in fields such as artificial intelligence and big data processing systems, and its future prospects, providing a valuable reference for future research on using and optimizing CUDA unified memory.
Authors: PANG Wenhao (庞文豪); WANG Jialun (王嘉伦); WENG Chuliang (翁楚良) (School of Data Science and Engineering, East China Normal University, Shanghai 200062, China; Research Institute of Interdisciplinary Innovation, Zhejiang Laboratory, Hangzhou 310000, Zhejiang, China)
Source: Computer Engineering (《计算机工程》); indexed in CAS, CSCD, and the PKU Core Journals list; 2024, No. 12, pp. 1-15 (15 pages)
Funding: National Natural Science Foundation of China (62272171); Zhejiang Province "Pioneer" and "Leading Goose" R&D Program (2022C04006).
Keywords: General-Purpose Graphics Processing Unit (GPGPU); unified memory; memory oversubscription; data management; heterogeneous system
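
The shared virtual memory space described in the abstract can be illustrated with a minimal CUDA sketch. This is not code from the paper; the kernel name `add_one`, the array size, and the launch configuration are illustrative assumptions. It shows one `cudaMallocManaged` allocation written by the CPU, updated by a GPU kernel, and read back by the CPU without any explicit `cudaMemcpy`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: add 1.0f to every element.
__global__ void add_one(float *data, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

int main() {
    const size_t n = 1 << 20;  // illustrative size (assumption)
    float *data = nullptr;

    // A single managed allocation; the same pointer is valid on CPU and GPU.
    cudaError_t err = cudaMallocManaged(&data, n * sizeof(float));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMallocManaged: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // The CPU initializes the buffer directly through the shared pointer.
    for (size_t i = 0; i < n; ++i) data[i] = 0.0f;

    // The GPU updates the same buffer; the runtime moves the data as needed.
    const int threads = 256;
    const int blocks = (int)((n + threads - 1) / threads);
    add_one<<<blocks, threads>>>(data, n);

    // Wait for the kernel before the CPU touches the buffer again.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        cudaFree(data);
        return 1;
    }

    // The CPU reads the result without an explicit device-to-host copy.
    printf("data[0] = %.1f, data[n-1] = %.1f\n", data[0], data[n - 1]);

    cudaFree(data);
    return 0;
}
```

Whether a managed allocation may exceed physical device memory (the oversubscription named in the keywords), and how the data is migrated, depends on the GPU architecture, driver, and operating system. The sketch deliberately leaves out the data-management tuning that the abstract identifies as the key to performance.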