Memory Access Optimization Techniques for GPGPU-Based LDPC Decoding (基于GPGPU的LDPC解码访存优化技术)

A memory optimization strategy of LDPC decoding based on GPGPU

Abstract: Low-density parity-check (LDPC) codes are powerful error-correcting codes that have been adopted by several digital communication standards, but the heavy computational demand of their decoding algorithm limits their potential. LDPC decoders based on general-purpose GPUs (GPGPUs) have attracted considerable attention in recent years for their lower cost and better flexibility. We analyze the parallelism of the sum-product algorithm (SPA) and propose an interleaver representation of the Tanner graph, which simplifies the decoding algorithm. Taking the GPU architecture into account, we propose a top-down, multi-stage optimization strategy that progressively exploits the GPU's acceleration capability. Experimental results show that balancing the computation and memory-access load, coalescing and aligning global memory accesses, and making full use of on-chip high-speed resources (e.g., shared memory and registers) significantly improve GPU performance. The proposed decoder achieves a 383x speedup over a CPU-based implementation and outperforms existing GPU-based LDPC decoders in overall performance.
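The abstract does not spell out how the Tanner graph is turned into an interleaver. The sketch below shows one common reading, offered as an assumption and not as the authors' construction: each edge of the graph is numbered once in check-node (row-major) order and once in variable-node (column-major) order, and the interleaver is the permutation between the two numberings. The function name `tanner_to_interleaver` and the small matrix `H` are hypothetical and chosen only for illustration.

```python
def tanner_to_interleaver(H):
    """Map each edge's check-node-order index to its variable-node-order index.

    H is a parity-check matrix given as a list of rows of 0/1 entries.
    Every nonzero entry is one edge of the Tanner graph.
    """
    # Enumerate edges in check-node (row-major) order.
    edges = [(r, c) for r, row in enumerate(H) for c, v in enumerate(row) if v]
    # Indices of the same edges sorted into variable-node (column-major) order.
    vn_order = sorted(range(len(edges)),
                      key=lambda e: (edges[e][1], edges[e][0]))
    # Invert that ordering: perm[cn_index] = vn_index.
    perm = [0] * len(edges)
    for vn_index, cn_index in enumerate(vn_order):
        perm[cn_index] = vn_index
    return perm


# Hypothetical 2 x 4 parity-check matrix with 6 edges.
H = [[1, 1, 0, 1],
     [0, 1, 1, 1]]
perm = tanner_to_interleaver(H)

# Shuffle check-to-variable messages into variable-node order.
cn_msgs = ["c0v0", "c0v1", "c0v3", "c1v1", "c1v2", "c1v3"]
vn_msgs = [None] * len(cn_msgs)
for cn_index, vn_index in enumerate(perm):
    vn_msgs[vn_index] = cn_msgs[cn_index]
```

Under this reading, each decoding half-iteration works on a contiguous message array, and the permutation alone carries the graph structure. That is one reason such a layout pairs naturally with coalesced global memory access on a GPU.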

Authors: Yuan Lüechao (原略超), Zhang Yang (张洋), Tang Chuan (唐川), Xing Zuocheng (邢座程)

Affiliation: College of Computer, National University of Defense Technology (国防科技大学计算机学院)

Source: China Sciencepaper (《中国科技论文》), 2013, No. 7, pp. 626-632 (7 pages). Indexed in CAS and the Peking University Core Journals list.

Funding: Specialized Research Fund for the Doctoral Program of Higher Education (20114307110001); National Natural Science Foundation of China (60873016, 61170083)

Keywords: low-density parity-check (LDPC) codes; decoder; sum-product algorithm; general-purpose graphics processing unit (GPGPU); optimization strategy; parallel computing

Classification: TN911.22 [Electronics and Telecommunications: Communication and Information Systems]
