期刊文献+

基于动态规划的自动向量化方法 被引量:1

Auto-Vectorization Method Based on Dynamic Programming
下载PDF
导出
摘要 由于SLP自动向量化算法使用的启发式策略会丢失一定的向量化机会,本文提出一种基于动态规划的自动向量化方法DPSLP,该方法采用比SLP更加激进的策略在基本块内寻找候选的SIMD指令分组,依据动态规划方程计算指令分组的代价并从众多指令分组中选择最优的分组进行向量化转换.实验结果显示,DPSLP与SLP相比测试程序的运行时间平均减少了8%,静态指令代价平均减少10%,平均向量宽度增加66.4%. As SLP(super-word level parallelism)auto-vectorization algorithm will lose some vectorization opportunities by using heuristic strategy,an auto-vectorization method named DPSLP that based on dynamic programming was proposed in this paper.In this method,the candidate statement groups were searched for SIMD(single instruction multiple data)instruction by using more aggressive strategy than SLP,and the optimal statement groups were selected to vectorize according to the cost of which calculated by dynamic programming formula.Experimental result show that DPSLP achieves on average a total decrease of 8% in execution time,10%in static instruction cost and increase of 66.4%in vector width,compared with SLP.
出处 《北京理工大学学报》 EI CAS CSCD 北大核心 2017年第5期544-550,共7页 Transactions of Beijing Institute of Technology
基金 国家部委重大专项基金资助项目(2014ZX01020-003) 国家自然科学基金资助项目(61136002)
关键词 自动向量化 动态规划 指令代价 auto-vectorization dynamic programming instruction cost
  • 相关文献

参考文献3

二级参考文献29

  • 1Stewart J. An investigation of SIMD instruction sets. University of Ballarat School of Information Technology and Mathematical Sciences, 2005. http://noisymime.org/blogimages/SIMD.pdf.
  • 2Nuzman D, Rosen I, Zaks A. Auto-Vectorization of interleaved data for SIMD, In: Proc. of the ACM SIGPLAN Conf. on Programming Language Design and Implementation. Ottawa: ACM Press, 2006. 132-143. [doi: 10.1145/1133981.1133996].
  • 3Zheng WM, Tang ZZ. Compiler Archtecture. Beijing: Tsinghua University Press, 1998 (in Chinese).
  • 4Allen R, Kennedy K. Optimizing Compilers for Modern Architectures--A Dependence-Based Approach. San Francisco: Morgan Kaufmann Publishers, 2001.
  • 5Shen ZY, Hu ZA, Liao XK, Wu HP, Zhao KJ, Lu YT. Methods of Parallel Compilation. Beijing: National Defence Industry Press, 2000 (in Chinese).
  • 6Bik AJC. The Software Vectorization Handbook--Applying Multimedia Extensions for Maximum Performance. Intel Press, 2004.
  • 7Hampton M, Asanovic K. Compiling for vector-thread architectures. In: Proc. of the 6th Annual IEEE/ACM Int'l Symp. on Code Generation and Optimization. Boston: ACM Press, 2008.205-215. [doi: 10.1145/1356058.1356085].
  • 8Naishlos D, Biberstein M, Ben-David S, Zaks A. Vectorizing for a SIMdD DSP architecture. In: Proc. of the 2003 Int'l ConL on Compilers, Architecture and Synthesis for Embedded Systems. San Jose: ACM Press, 2003.2-11. [doi: 10.1145/951710.951714].
  • 9Bik AJC, GirKar M, Grey PM, Tian XM. Automatic intra-register vectorization for the Intel architecture. Int'l Journal of Parallel Programming, 2002,30(2):65-98. [doi: 10.1023/A:1014230429447].
  • 10Wu P, Eichenberger AE, Wang A, Zhao P. An integrated simdization framework using virtual vectors. In: Proc. of the 19th Annual Int'l Conf. on Supercomputing. Cambridge: ACM Press, 2005. 169-178. [doi: 10.1145/1088149.1088172].

共引文献43

同被引文献8

引证文献1

相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部