Study on Hybrid Resource Heuristic Loop Unrolling Factor Selection Method Based on Vector DSP
Abstract In modern processors, the very long instruction word (VLIW) architecture with integrated vector processing units has gradually become a typical representative of high-performance digital signal processor (DSP) architectures. Such architectures are characterized by rich register resources and many instruction execution units. Based on these characteristics, this paper studies the corresponding loop unrolling factor selection problem and proposes a selection method that improves the effect of loop unrolling, an important optimization. The method takes into account the vector or scalar attribute of the code in the loop body and the usage rules of base address registers and index registers. In addition, two heuristics are added to the selection algorithm: the proportion of execution unit usage and the power alignment of the unrolling factor. Experiments on three commonly used digital signal processing algorithms demonstrate the method's effectiveness in exploiting more instruction-level parallelism. For these three algorithms, unrolling with the factors obtained by the proposed method improves average performance by more than 10% compared with existing methods. In particular, experiments on the FFT algorithm show that, through the hybrid resource heuristics, the proposed method analyzes the usage of the related hardware resources more accurately, decides whether to unroll, and obtains the corresponding loop unrolling factor.
Authors LU Hao-song; HU Yong-hua; WANG Shu-ying; ZHOU Xin-lian; LI Hui-xiang (School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China)
Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2022, Issue S01, pp. 777-783 (7 pages)
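Funding Scientific Research Project of the Hunan Provincial Department of Education (20B242, 19A169); Natural Science Foundation of Hunan Province (2017JJ3087); National Natural Science Foundation of China (61872138).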
Keywords Loop unrolling; Unrolling factor; VLIW; Vector DSP; Compiler optimization
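
The abstract names three ingredients of the selection method: register usage limits, the proportion of execution unit usage, and power alignment of the unrolling factor. The paper's actual algorithm is not reproduced in this record, so the following is only a minimal sketch of how such ingredients could be combined. Every name, field, and the cost model itself are hypothetical, and reading "power alignment" as restricting candidates to powers of two is an assumption.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class LoopBody:
    """Resource demand of ONE copy of the loop body (hypothetical model)."""
    unit_ops: dict   # operations per execution-unit class, e.g. {"mac": 1, "ls": 3}
    regs: int        # general registers needed by one copy
    addr_regs: int   # base/index address registers needed by one copy


@dataclass
class Target:
    """Resources available on the target DSP (hypothetical model)."""
    unit_counts: dict  # execution units per class, same keys as LoopBody.unit_ops
    regs: int
    addr_regs: int


def unit_usage(loop, target, factor):
    """Fraction of issue slots filled when `factor` body copies are packed
    into a purely resource-bound schedule (dependences are ignored)."""
    cycles = max(ceil(factor * ops / target.unit_counts[unit])
                 for unit, ops in loop.unit_ops.items())
    total_ops = factor * sum(loop.unit_ops.values())
    total_slots = cycles * sum(target.unit_counts.values())
    return total_ops / total_slots


def select_unroll_factor(loop, target, max_factor=16):
    """Pick the smallest power-of-two factor with the best unit usage,
    subject to register and address-register limits."""
    # Register pressure caps how many body copies can be live at once.
    cap = min(max_factor,
              target.regs // max(1, loop.regs),
              target.addr_regs // max(1, loop.addr_regs))
    best, best_usage = 1, unit_usage(loop, target, 1)
    factor = 2
    while factor <= cap:               # candidates 2, 4, 8, ... (assumed alignment)
        usage = unit_usage(loop, target, factor)
        if usage > best_usage:         # strict '>' keeps the smaller factor on ties
            best, best_usage = factor, usage
        factor *= 2
    return best
```

For a hypothetical target with four units in each of two classes and a loop body that issues one operation per class, `select_unroll_factor` settles on 4, the first power of two that fills every issue slot. A real compiler would also have to account for data dependences, code size, and the vector or scalar attribute of the loop body that the abstract mentions.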
