PARBLO: Page-Allocation-Based DRAM Row Buffer Locality Optimization (cited by: 2)

Abstract: DRAM row buffer conflicts can increase memory access latency significantly. This paper presents a new page-allocation-based optimization that works seamlessly with some existing hardware and software optimizations to eliminate significantly more row buffer conflicts. Validation in simulation, using a set of selected scientific and engineering benchmarks against a few representative memory controller optimizations, shows that the method can reduce row buffer miss rates by up to 76% (37.4% on average). This reduction in row buffer miss rates translates into performance speedups of up to 15% (5% on average).
Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), indexed in SCIE, EI, CSCD. 2009, Issue 6, pp. 1086-1097 (12 pages).
Funding: Supported by the National Basic Research 973 Program of China under Grant No. 2005CB321602 and the National Natural Science Foundation of China under Grant No. 60736012.
Keywords: DRAM, row buffer, page allocation, locality optimization
References (32)

  • 1. McKee S A, Wulf W A, Aylor J H et al. Dynamic access ordering for streamed computations. IEEE Trans. Computers, 2000, 49(11): 1255-1271.
  • 2. Rixner S, Dally W J, Kapasi U J, Mattson P R, Owens J D. Memory access scheduling. In Proc. ISCA 2000, Vancouver, Canada, June 10-14, 2000, pp.128-138.
  • 3. Rixner S. Memory controller optimizations for Web servers. In Proc. MICRO 2004, Portland, USA, Dec. 4-8, 2004, pp.355-366.
  • 4. Shao J, Davis B T. A burst scheduling access reordering mechanism. In Proc. HPCA 2007, Phoenix, USA, Feb. 10-14, 2007, pp.285-294.
  • 5. Zhang Z, Zhu Z, Zhang X. A permutation-based page interleaving scheme to reduce row-buffer conflicts and exploit data locality. In Proc. MICRO 2000, Monterey, USA, Dec. 10-13, 2000, pp.32-41.
  • 6. Lin W F, Reinhardt S K, Burger D. Reducing DRAM latencies with an integrated memory hierarchy design. In Proc. HPCA 2001, Nuevo Leon, Mexico, Jan. 20-24, 2001, pp.301-312.
  • 7. Shin J, Chame J, Hall M W. A compiler algorithm for exploiting page-mode memory access in embedded-DRAM devices. In Proc. the 4th Workshop on Media and Streaming Processors, Istanbul, Turkey, Nov. 18-19, 2002.
  • 8. Ding C, Kennedy K. Improving effective bandwidth through compiler enhancement of global cache reuse. In Proc. IPDPS 2001, San Francisco, USA, April 23-27, 2001, p.38.
  • 9. Jacob B, Ng S W, Wang D T (with contributions by Samuel Rodriguez). Memory Systems: Cache, DRAM, Disk. Morgan Kaufmann Publishers, September 2007. ISBN 978-0-12-379751-3.
  • 10. Mutlu O, Moscibroda T. Parallelism-aware batch scheduling: Enhancing both performance and fairness of shared DRAM systems. In Proc. ISCA 2008, Beijing, China, June 21-25, 2008, pp.63-74.
