Testing Methods for Memory Access Performance of Multi-level Cache Data Prefetching Processor
Abstract Memory access performance tests for processors often fail to account for multi-level cache data prefetching, so the measured results do not reflect the hardware's actual performance. To address this, the paper analyzes multi-level cache data prefetching optimization techniques and their effect on memory access bandwidth. It proposes an optimized memory access performance test method for processors with multi-level caches. The method makes full use of the cache data prefetching mechanism and avoids resource contention between processor cores, which improves the measured memory access performance. Experimental data show that the method yields results consistent with the hardware's actual memory access performance, supporting accurate evaluation of the memory access capability of high-performance processors.
Source Information Technology & Standardization (《信息技术与标准化》), 2023, No. 6, pp. 25-29 (5 pages)
Keywords multi-level cache; cache data prefetching; memory access performance; processor; memory bandwidth
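
The abstract does not give the details of the proposed test method, so the following is only a minimal illustrative sketch of the two ideas it names, not the authors' implementation. It measures sequential copy bandwidth (a contiguous access pattern that hardware prefetchers can typically follow) and optionally pins the process to a single core to reduce interference from scheduling. The function names `copy_bandwidth_gbs` and `pin_to_one_core`, the 64 MiB default buffer size, and the repeat count are all assumptions made for illustration. Core pinning relies on `os.sched_setaffinity`, which is only available on some platforms such as Linux.

```python
import os
import time


def pin_to_one_core():
    """Try to pin this process to one of its currently allowed cores.

    Returns the chosen core id, or None if pinning is unsupported or fails.
    (Hypothetical helper; affinity calls exist only on some platforms.)
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    try:
        core = min(os.sched_getaffinity(0))
        os.sched_setaffinity(0, {core})
        return core
    except OSError:
        return None


def copy_bandwidth_gbs(size_bytes=64 * 1024 * 1024, repeats=5):
    """Return the best observed sequential copy bandwidth in GB/s.

    The buffer should be larger than the last-level cache so that the
    copy is served from main memory; 64 MiB is an assumed default.
    """
    src = bytearray(size_bytes)
    dst = bytearray(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # contiguous copy of the whole buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    # Count one read and one write of size_bytes per copy; any extra
    # write-allocate traffic is ignored in this estimate.
    return 2 * size_bytes / best / 1e9


if __name__ == "__main__":
    core = pin_to_one_core()
    print("pinned to core:", core)
    print("copy bandwidth: %.2f GB/s" % copy_bandwidth_gbs())
```

Taking the best of several repeats, rather than the mean, follows the usual practice of bandwidth microbenchmarks, since slower runs mostly reflect interference rather than hardware limits. The figure reported here also includes Python call overhead, so it should be read as a rough lower bound rather than a precise hardware measurement.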