Parallelization of DFT Based on Embedded Mobile GPU (cited by: 2)
Abstract: GPGPU enables large-scale parallel acceleration of compute-intensive problems, offering a new way to implement the DFT efficiently on embedded platforms. This paper implements parallel acceleration schemes for the DFT and FFT on the Mali-T604 embedded GPU and tests them on real hardware. The experimental results show that the proposed schemes effectively accelerate the DFT and FFT on an ARM embedded platform and can greatly improve the real-time performance of digital signal processing on mobile devices.
Authors: Zeng Baoguo (曾宝国), Yang Bin (杨斌)
Source: Microcontrollers & Embedded Systems (《单片机与嵌入式系统应用》), 2016, No. 1, pp. 12-15 (4 pages)
Keywords: DFT; FFT; GPGPU; Mali-T604 GPU; digital signal processing; ARM embedded system
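To illustrate why the DFT lends itself to the kind of parallel acceleration the abstract describes, here is a minimal CPU-side sketch in Python. It is not the authors' Mali-T604 implementation, which this record does not reproduce. The function names `dft_bin`, `dft`, and `fft` are hypothetical, and the mapping of one output bin to one GPU work-item is an assumption made for illustration only.

```python
import cmath


def dft_bin(x, k):
    """Compute the single output bin X[k] of the N-point DFT of x."""
    n_points = len(x)
    return sum(
        x[n] * cmath.exp(-2j * cmath.pi * k * n / n_points)
        for n in range(n_points)
    )


def dft(x):
    """Direct O(N^2) DFT.

    Each bin depends only on the input and its own index k, so the
    iterations are independent of each other. On a GPU, each bin could
    be handled by its own work-item (assumed mapping, not from the paper).
    """
    return [dft_bin(x, k) for k in range(len(x))]


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n_points = len(x)
    if n_points == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    half = n_points // 2
    out = [0j] * n_points
    # The butterflies within one stage are also independent of each other.
    for k in range(half):
        t = cmath.exp(-2j * cmath.pi * k / n_points) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out
```

Calling `dft([1, 2, 3, 4])` and `fft([1, 2, 3, 4])` should give the same four bins up to floating-point rounding. The direct form costs O(N^2) operations but has no dependencies between bins, while the FFT costs O(N log N) and exposes parallelism within each butterfly stage.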