High Speed FIR Digital Filtering on GPU (GPU实现的高速FIR数字滤波算法)
Abstract: Existing GPU-based FIR algorithms are slow and scale poorly. This paper proposes a high-speed parallel algorithm for multi-channel FIR digital filtering, accelerated by balancing the parallel computational load and by reducing memory access density. The algorithm formulates filtering as a matrix multiplication, which exposes enough data-level parallelism to build a parallel filtering model on the GPU. Each thread performs two signal operations within a single instruction cycle, which yields high-speed filtering of multi-channel signals. Experimental results on a GTX260+ platform show an average speed-up of 203x and an efficiency above 40%, together with better scalability.
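The abstract's central idea, casting FIR filtering as a matrix multiplication, can be illustrated with a small serial sketch. The abstract does not give the paper's actual matrix layout or GPU kernel, so everything here is an assumption for illustration: the names `fir_direct`, `delay_matrix` and `fir_matmul` are hypothetical, and the code is plain Python, not CUDA. Each channel's samples are arranged into a matrix of delayed copies, and filtering becomes a product of that matrix with the coefficient vector, where every output row is independent and could in principle be assigned to its own thread.

```python
def fir_direct(x, h):
    """Reference FIR: y[n] = sum_k h[k] * x[n-k], taking x[n] = 0 for n < 0."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def delay_matrix(x, taps):
    """Row n holds x[n], x[n-1], ..., x[n-taps+1], zero-padded at the start."""
    return [[x[n - k] if n - k >= 0 else 0.0 for k in range(taps)]
            for n in range(len(x))]


def fir_matmul(channels, h):
    """Filter every channel as a matrix-vector product (assumed layout)."""
    out = []
    for x in channels:
        X = delay_matrix(x, len(h))
        # Each row's dot product is independent of the others, which is
        # the data-level parallelism a GPU implementation would exploit.
        out.append([sum(a * b for a, b in zip(row, h)) for row in X])
    return out


# Two short example channels and a 3-tap filter (made-up values).
channels = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, -1.0]]
h = [0.5, 0.25, 0.25]
y = fir_matmul(channels, h)
print(y)
```

The matrix form produces the same output as the direct convolution sum. Its benefit is that the work becomes a regular, uniform set of dot products, which is the shape of computation that maps well onto GPU threads.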
Source: Journal of Computer-Aided Design & Computer Graphics (《计算机辅助设计与图形学学报》), 2010, No. 9, pp. 1435-1442 (8 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: NVidia Professor Partnership Award; CUDA Center of Excellence
Keywords: finite impulse response (FIR); digital filter; parallel computing; CUDA; GPU