Abstract
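Paper [1] proposed a data fusion optimization between arrays and evaluated its effect on an IA-32 server. The results showed that on IA-32 machines, data fusion, under the control of a performance-cost model, noticeably improves the cache utilization of applications with non-contiguous data access patterns. How effective is data fusion on the newer IA-64 architecture? This paper uses an Intel IA-32 server and an HP Itanium server as platforms, compiles the programs before and after the data fusion transformation with the Intel Fortran compilers ifc and efc and with the free-software compiler g95, runs them, and collects execution times and related performance data on both platforms. The results show that source-level data fusion does not cooperate well with the high-level optimizations of the efc compiler on IA-64: under the O3 optimization switch, the effect of the optimization is negative. The results further indicate that high-level compiler optimizations such as data prefetching, loop transformation, and data transformation must be considered together, in light of the characteristics of the architecture, to achieve a good global result. This paper is intended as a starting point for research on the performance portability to IA-64 of compiler optimization algorithms designed for the IA-32 architecture.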
A data-fusion-based approach to improving data locality was presented in paper [1]. The evaluation results, obtained under the control of a performance-cost model, show that data fusion can improve the performance of applications with non-contiguous data access patterns on IA-32 computers. How does it perform on IA-64 computers? This paper uses the Intel ifc (IA-32) compiler, the Intel efc (IA-64) compiler, and the g95 (GNU Fortran 95) compiler to compile the original and the optimized source code, and runs the executables on an Intel IA-32 computer and an IA-64 computer respectively. The results show that data fusion optimization does not cooperate effectively with the high-level optimizations of the efc compiler, such as loop transformation, on the IA-64 computer. When the test program is compiled with efc -O3, the execution time of the optimized program is longer than that of the non-optimized program. The results also show that high-level compiler optimizations such as data prefetching, loop transformation, and data transformation must be considered together and matched to the characteristics of the underlying IA-64 micro-architecture.
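The abstract does not spell out the transformation itself. As a rough illustration only, the sketch below assumes that "data fusion" means interleaving two arrays that are accessed together into a single array, so that elements used in the same iteration sit next to each other in memory. The function names (`fuse`, `strided_sum_separate`, `strided_sum_fused`) are hypothetical and do not come from the paper, and the sketch shows only the index mapping, not any measured performance effect.

```python
def fuse(a, b):
    """Interleave two equal-length arrays as [a0, b0, a1, b1, ...].

    Assumed layout for illustration; the paper's actual transformation
    may differ.
    """
    if len(a) != len(b):
        raise ValueError("arrays must have the same length")
    fused = [0] * (2 * len(a))
    fused[0::2] = a  # even slots hold the elements of a
    fused[1::2] = b  # odd slots hold the elements of b
    return fused


def strided_sum_separate(a, b, stride):
    """Non-contiguous access over two separate arrays."""
    total = 0
    for i in range(0, len(a), stride):
        total += a[i] * b[i]
    return total


def strided_sum_fused(fused, stride):
    """Same computation on the fused layout: a[i] and b[i] are adjacent."""
    total = 0
    n = len(fused) // 2
    for i in range(0, n, stride):
        total += fused[2 * i] * fused[2 * i + 1]
    return total
```

Both loops compute the same value. In a compiled language, the fused version would touch one memory region per iteration instead of two, which is the kind of locality gain the abstract reports on IA-32.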
Source
《计算机工程与应用》
CSCD
Peking University Core Journal
2005, No. 15, pp. 1-4, 16 (5 pages in total)
Computer Engineering and Applications
Funding
National High Technology Research and Development Program of China (863 Program) (Grant Nos. 2002AA1Z2101, 2004AA1Z2210)