
Hardware Prefetching Mechanism Based on Double Step Data Stream
Cited by: 1
Abstract: Hardware data prefetching can effectively improve a processor's memory access performance, but traditional stream prefetching strategies suffer from untimely prefetches. To address this, a double step stream prefetching strategy is proposed and a corresponding prefetch unit is designed. The prefetch unit automatically detects the fixed step size (stride) of a data stream and doubles it to compute the prefetch address. Experimental results show that, with the prefetch unit added, processor performance improves by up to 45% for integer applications and up to 57% for floating-point applications in the SPEC2006 test set. For applications with a high Cache Miss rate, the prefetch unit effectively hides memory access latency.
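The address calculation described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the single-stream tracking, the class and field names, and the `confirm_threshold` value are all hypothetical. The only part taken from the abstract is that, once a fixed step is detected, the prefetch address is the current address plus twice that step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DoubleStridePrefetcher:
    """Toy single-stream model; names and threshold are illustrative assumptions."""
    confirm_threshold: int = 2        # hypothetical: repeats needed before prefetching
    last_addr: Optional[int] = None   # address of the previous demand access
    stride: int = 0                   # most recently observed step between accesses
    confirmations: int = 0            # how many times the step has repeated in a row

    def access(self, addr: int) -> Optional[int]:
        """Record one demand access; return a prefetch address or None."""
        if self.last_addr is None:
            self.last_addr = addr
            return None
        new_stride = addr - self.last_addr
        self.last_addr = addr
        if new_stride != 0 and new_stride == self.stride:
            self.confirmations += 1
        else:
            # Step changed (or is zero): restart detection with the new step.
            self.stride = new_stride
            self.confirmations = 0
        if self.confirmations >= self.confirm_threshold:
            # Double step: look two steps ahead instead of one.
            return addr + 2 * self.stride
        return None


def run(addresses: List[int]) -> List[Optional[int]]:
    """Feed an address trace through a fresh prefetcher model."""
    prefetcher = DoubleStridePrefetcher()
    return [prefetcher.access(a) for a in addresses]


if __name__ == "__main__":
    print(run([0, 64, 128, 192, 256]))
```

In this model a conventional stream prefetcher would request `addr + stride`, while the double step variant requests `addr + 2 * stride`, which gives the request more time to complete before the demand access reaches that address.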
Authors: WANG Jinhan (王锦涵); LI Jun (李俊); LU Dongdong (路冬冬); ZHANG Hailong (张海龙); ZHU Ying (朱英) (Shanghai High Performance IC Design Center, Shanghai 201204, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2019, No. 6, pp. 115-118, 126 (5 pages)
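Funding: National Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (核高基), "Supercomputer Processor Development" (20172X01028101-001)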
Keywords: hardware prefetching; double step; stream prefetching; SPEC2006 test set; Cache Miss rate











