
Compiler Optimization Method Based on Multiple Data Supply
Abstract: Fast and timely data supply directly affects the performance of memory-intensive programs. This paper proposes MDS (Multiple Data Supply), a compiler optimization method that, without increasing processor design complexity, uses the high bandwidth of existing processors to read or write multiple data items in a single memory access, which reduces the number of memory accesses and improves application efficiency. During compiler optimization, auto-vectorization generates a vectorized tree structure, and a new expansion path is added to handle the expansion from the vectorized tree structure to the lower-level representation. To cope with the diversity of tree structures after vectorization, the paper designs a new optimization pass and the RAC (Register Assignment Chain) replacement algorithm. On the Godson-3A (龙芯3A) processor platform, tests with SPEC CPU2000 show an average performance improvement of 11.6% for the CINT programs and 14.4% for the CFP programs.
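The core idea in the abstract, replacing several narrow memory accesses with one wide access, can be illustrated with a minimal C sketch. This is not the paper's implementation, which operates inside the compiler on vectorized tree structures. The function names, the `chunk128` type, and the 128-bit access width are assumptions made for illustration only. Each function returns the number of load operations it issued, so the reduction in memory accesses is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Baseline: one 32-bit load and one 32-bit store per element.
 * Returns the number of load operations issued. */
size_t copy_scalar(int32_t *dst, const int32_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    size_t loads = 0;
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i];
        loads++;
    }
    return loads;
}

/* Hypothetical 128-bit container holding four 32-bit elements.
 * The width is an assumption; the abstract does not state one. */
typedef struct { uint64_t lo, hi; } chunk128;

/* "Multiple data" version: move four elements per wide access,
 * then finish the remaining elements one at a time.
 * Returns the number of load operations issued. */
size_t copy_wide(int32_t *dst, const int32_t *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    size_t loads = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        chunk128 c;
        /* Fixed-size memcpy avoids alignment and aliasing problems;
         * compilers may lower it to wide load/store instructions. */
        memcpy(&c, src + i, sizeof c);
        memcpy(dst + i, &c, sizeof c);
        loads++;
    }
    for (; i < n; i++) {   /* scalar tail for leftover elements */
        dst[i] = src[i];
        loads++;
    }
    return loads;
}
```

For a 10-element array, the scalar version issues 10 loads, while the wide version issues 2 wide loads plus 2 scalar loads for the tail. Whether a compiler actually emits a single wide instruction for each `memcpy` depends on the target and optimization level.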
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 11, pp. 2280-2284 (5 pages)
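Funding: Supported by the National "Core Electronic Devices, High-end Generic Chips and Basic Software" (核高基) Major Special Project (2009ZX01028-002-003-005) and the National Natural Science Foundation of China (60833004)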
Keywords: compiler optimization; MDS (Multiple Data Supply); auto-vectorization; RAC replacement algorithm; Godson-3A

