
Optimization to Prevent Cache Penalty by Loop Partition and Loop Unrolling (利用循环分割和循环展开避免Cache代价) Cited by: 2
Abstract: The speed gap between the memory system and the processor keeps widening. Cache hierarchies were introduced into the memory system to narrow this gap, but they also add extra memory latency, referred to as cache penalty. This paper presents PCPLPU (prevent cache penalty by loop partition-unrolling), an algorithm that combines loop partition with loop unrolling to prevent cache penalty in loops. Experimental results show that PCPLPU effectively prevents cache penalty and improves program performance.
Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2008, No. 9, pp. 2228-2242 (15 pages)
Funding: National Natural Science Foundation of China
Keywords: loop partition; loop unrolling; cache penalty; bank conflict
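The abstract names the two transformations that PCPLPU combines but gives no details of the algorithm itself. As a rough illustration of the transformations only, the C sketch below splits a loop's iteration range at a boundary and unrolls each part by a factor of four. It is not the paper's algorithm: the function names, the unroll factor, and the `split` parameter are hypothetical, and how PCPLPU actually chooses partition points or unroll degrees is not stated in this record.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: one plain loop over the whole iteration space. */
static long sum_plain(const long *a, size_t n) {
    long s = 0;
    for (size_t i = 0; i < n; i++) {
        s += a[i];
    }
    return s;
}

/* Loop unrolling (assumed factor: 4) over the range [lo, hi).
 * Four independent accumulators give the scheduler room to reorder
 * loads; a cleanup loop handles the leftover iterations. */
static long sum_unrolled4(const long *a, size_t lo, size_t hi) {
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = lo;
    for (; i + 4 <= hi; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < hi; i++) {
        s0 += a[i];
    }
    return s0 + s1 + s2 + s3;
}

/* Loop partition: split the iteration space at a hypothetical boundary
 * `split`, so each part can be transformed separately. Here both parts
 * are simply unrolled by 4. */
static long sum_partitioned(const long *a, size_t n, size_t split) {
    if (split > n) {
        split = n; /* clamp an out-of-range boundary */
    }
    return sum_unrolled4(a, 0, split) + sum_unrolled4(a, split, n);
}

/* Checks that the transformed loops match the plain loop for every
 * possible split point of a small array. */
static void self_check(void) {
    long a[11];
    for (size_t i = 0; i < 11; i++) {
        a[i] = (long)i + 1; /* values 1..11, which sum to 66 */
    }
    assert(sum_plain(a, 11) == 66);
    for (size_t split = 0; split <= 12; split++) {
        assert(sum_partitioned(a, 11, split) == 66);
    }
}
```

The sketch uses integer addition so that the reordering introduced by unrolling cannot change the result. With floating-point data the same transformation may alter rounding, which is one reason compilers apply it conservatively.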