Testing Methods for Memory Access Performance of Multi-level Cache Data Prefetching Processor

Abstract: Memory access performance tests for processors often do not account for multi-level cache data prefetching optimization, so the measured data fail to reflect actual performance. To address this, the paper analyzes multi-level cache data prefetching optimization techniques and their impact on memory access bandwidth. It proposes an optimized memory access performance testing method for multi-level cache processors that makes full use of the cache data prefetching mechanism and avoids resource contention between processor cores, improving the measured memory access performance. Experimental data show that the method yields results consistent with the hardware's actual memory access performance, supporting accurate evaluation of the memory access capability of high-performance processors.
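The abstract does not say which benchmark kernel or array sizes the authors use. As an illustration of how memory access bandwidth is commonly computed, the sketch below assumes a STREAM-style triad kernel (`a[i] = b[i] + s * c[i]`) and the usual STREAM guideline that each array should be at least four times the last-level cache size. All function names here are hypothetical. Pure Python timing is dominated by interpreter overhead, so the sketch shows only the byte accounting and array sizing, not a real hardware measurement, and it does not model prefetching or core pinning.

```python
import time
from array import array

BYTES_PER_ELEMENT = 8  # assumption: double-precision elements, as in STREAM


def min_array_elements(llc_bytes, factor=4):
    """Smallest element count so that one array is `factor` times the LLC size."""
    # Ceiling division, so the array is never smaller than the target.
    return -(-llc_bytes * factor // BYTES_PER_ELEMENT)


def triad_bytes(n):
    """Bytes moved by a[i] = b[i] + s * c[i] over n elements (2 reads + 1 write)."""
    return 3 * BYTES_PER_ELEMENT * n


def bandwidth_mb_s(n, seconds):
    """Triad bandwidth in MB/s (1 MB = 10**6 bytes) for n elements in `seconds`."""
    return triad_bytes(n) / seconds / 1e6


def run_triad(n, scalar=3.0, repeats=3):
    """Run the triad `repeats` times and report bandwidth from the best time."""
    b = array("d", [1.0]) * n
    c = array("d", [2.0]) * n
    a = array("d")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        a = array("d", (x + scalar * y for x, y in zip(b, c)))
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer.
    return a, bandwidth_mb_s(n, max(best, 1e-9))
```

For example, with a hypothetical 32 MiB last-level cache, `min_array_elements(32 * 1024 * 1024)` gives 16777216 elements per array. A real test along the lines of the abstract would implement the kernel in a compiled language and pin threads to cores.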

Authors: Zhong Weijun (钟伟军), Tian Chenyan (田晨燕)

Affiliation: China Electronics Standardization Institute

Source: Information Technology & Standardization, 2023, No. 6, pp. 25-29 (5 pages)

Keywords: multi-level cache; cache data prefetching; memory access performance; processor; memory bandwidth

Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]
