
基于节点预测的直接Cache一致性协议 (Direct Cache Coherence Protocol Based on Node Prediction), cited by: 33

Node Predicting Based Direct Cache Coherence Protocol for Chip Multi-processor
Abstract: Processor performance gains depend on exploiting the performance of the memory system. As the number of cores integrated on a chip keeps growing and feature sizes keep shrinking, cache coherence protocols whose latency and storage overhead scale well have become a key factor in memory access efficiency. This paper proposes NPP, a direct cache coherence protocol based on node prediction, and studies techniques for hiding coherence transaction latency and reducing directory storage overhead. Read and write misses suffer from indirection, and existing solutions break established data locality and cannot obtain the nearest valid copy of the data. To address these problems, the paper proposes a node hanging technique and a direct write-miss processing technique, which hide the directory access latency of read misses and write misses. To make node prediction accurate, it also proposes a history information update algorithm based on "signature" collection, which avoids redundant and incomplete updates. Using the SPLASH-2 benchmark suite on a 64-core CMP with a 2D mesh NoC, and compared with a flat full-map directory protocol, NPP reduces average execution time by 21.78% to 31.11%, average read miss latency by 14.22% to 18.9%, and average write miss latency by 17.89% to 21.13%. The cost of these gains is an average increase in on-chip network traffic of 6.62% to 7.28%.
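The indirection problem named in the abstract can be made concrete with a small hop-count sketch. It is only an illustration under stated assumptions, not the NPP protocol itself: the row-major tile numbering, dimension-ordered (XY) routing, the three-leg directory transaction, and the example tile ids are all assumptions that do not come from the paper. The abstract gives no detail on node hanging or the signature-based update algorithm, so neither is modeled here.

```python
# Illustrative sketch only. Tile numbering, routing, and the example tiles
# are assumptions; this does not implement NPP.
MESH_DIM = 8  # 8 x 8 mesh = 64 tiles, matching the 64-core CMP in the abstract


def coords(tile):
    """Map a tile id (0..63) to (row, col), assuming row-major numbering."""
    return divmod(tile, MESH_DIM)


def hops(src, dst):
    """Manhattan distance between two tiles (assumes XY routing)."""
    (r1, c1), (r2, c2) = coords(src), coords(dst)
    return abs(r1 - r2) + abs(c1 - c2)


def indirect_read_miss(requester, home, owner):
    """Directory-style path: requester -> home directory -> owner -> requester."""
    return hops(requester, home) + hops(home, owner) + hops(owner, requester)


def direct_read_miss(requester, predicted_owner):
    """Direct path: requester -> predicted owner -> requester.

    Assumes the prediction is correct; a misprediction would fall back
    to some slower path that is not modeled here.
    """
    return 2 * hops(requester, predicted_owner)


if __name__ == "__main__":
    requester, home, owner = 0, 63, 9  # hypothetical tiles
    print(indirect_read_miss(requester, home, owner))  # 28
    print(direct_read_miss(requester, owner))          # 4
```

In this hypothetical case the owner sits two hops from the requester while the home directory is in the opposite corner, so going through the directory costs 28 link traversals against 4 for a correctly predicted direct request. Real latency also depends on router delay, contention, and misprediction handling, none of which this sketch covers.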
Source: Chinese Journal of Computers (《计算机学报》); indexed in EI, CSCD, and the Peking University Core Journals list; 2014, No. 3, pp. 700-720 (21 pages)
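Funding: National Science and Technology Major Project "核高基" (Core Electronic Devices, High-end General-purpose Chips, and Basic Software), grants 2009ZX01039-003-001-03 and 2009ZX01023-004; National Natural Science Foundation of China, grant 60905007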
Keywords: chip multi-processor (CMP); prediction; coherence protocol; directory; scalable

References: 3

Secondary references: 27


Co-citing documents: 19

Co-cited documents: 169

Citing documents: 33

Secondary citing documents: 208
