Abstract
Superword level parallelism (SLP) is an automatic vectorization method that targets the single instruction multiple data (SIMD) extensions of processors, and it is widely used in mainstream compilers. SLP works by first finding sequences of isomorphic instructions within a basic block and then vectorizing them. A current research trend is to extend the applicability of SLP by converting non-isomorphic instruction sequences into equivalent isomorphic ones. This paper proposes SLP-M, an extension of SLP. SLP-M introduces a new isomorphic transformation, binary expression replacement, and selects among multiple isomorphic transformations on the basis of condition checks and benefit estimation. Non-isomorphic instruction sequences that satisfy specific conditions are converted into isomorphic sequences and then vectorized, which broadens the applicability of SLP and increases its benefit. SLP-M is implemented in LLVM and evaluated on standard benchmark suites including SPEC CPU 2017. The experiments show that, compared with existing methods, SLP-M improves performance by 21.8% on kernel functions and by 4.1% on the benchmark programs as a whole.
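To make the notion of isomorphic and non-isomorphic instruction sequences concrete, the following is a minimal sketch in C. It is not taken from the paper and does not claim to reproduce SLP-M's binary expression replacement; the function names `kernel_original` and `kernel_isomorphic` are hypothetical, and the rewrite of `x - y` as `x + (-y)` is only one well-known way to make two statements use the same opcode.

```c
/* Illustrative sketch only; names and the chosen rewrite are assumptions. */

/* Non-isomorphic: the two statements apply different opcodes (+ and -),
 * so a basic SLP pass cannot pack them into one vector operation. */
void kernel_original(const float *b, const float *c, float *a) {
    a[0] = b[0] + c[0];
    a[1] = b[1] - c[1];
}

/* Isomorphic: x - y is rewritten as x + (-y), which is exact for
 * IEEE 754 floats. Both statements are now additions, so the pair
 * forms a candidate for a single two-lane vector add. */
void kernel_isomorphic(const float *b, const float *c, float *a) {
    a[0] = b[0] + c[0];
    a[1] = b[1] + (-c[1]);
}
```

Whether such a rewrite pays off depends on the cost of the extra negation against the gain from the vector add, which is the kind of trade-off the benefit estimation mentioned in the abstract has to weigh.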
Authors
冯竞舸
贺也平
陶秋铭
马恒太
Feng Jingge; He Yeping; Tao Qiuming; Ma Hengtai (National Engineering Research Center for Fundamental Software (Institute of Software, Chinese Academy of Sciences), Beijing 100190; State Key Laboratory of Computer Science (Institute of Software, Chinese Academy of Sciences), Beijing 100190)
Source
《计算机研究与发展》
EI
CSCD
PKU Core Journals
2023, No. 12, pp. 2907-2927 (21 pages)
Journal of Computer Research and Development
Funding
Strategic Priority Research Program of the Chinese Academy of Sciences (XDA-Y01-01, XDC02010600).
Keywords
SIMD extension
auto-vectorization
superword level parallelism (SLP)
non-isomorphic instruction sequence
isomorphic transformation