
An Adaptive Migration-Replication Mechanism for Virtual Shared Regions Partition
Abstract The speed gap between processor and memory keeps widening, which makes program performance on chip multiprocessors (CMPs) increasingly dependent on the design of the on-chip memory hierarchy. Traditional data management mechanisms, however, are unaware of the non-uniform access latency of a large distributed cache, so the conflict between miss rate and hit latency grows more severe. Moreover, neither data migration alone nor blind replication can resolve contended accesses and long-latency hits to shared blocks. To address these problems, this paper proposes an adaptive migration-replication (AMR) mechanism built on the virtual shared regions (VSR) partition of the distributed cache in a tiled CMP. AMR jointly considers the state of the victim candidate in the local target bank and the local activity of the remotely hit block, and on that basis decides dynamically whether to migrate or replicate shared data that multiple cores contend for across VSRs. This balances long-latency on-chip hits against effective use of cache capacity and reduces the average memory access latency. The VSR partition and the AMR mechanism are implemented in a full-system simulator and evaluated with the SPLASH-2 benchmark suite. Experiments show that, compared with the traditional fixed shared-region partition and with similar optimization mechanisms, AMR improves performance under different sharing degrees, and its area overhead is negligible.
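The abstract names the two inputs to the migration-replication decision (the state of the local victim candidate and the local activity of the remotely hit block) but not the exact rule. The sketch below is one plausible reading, not the paper's algorithm: the names `VictimCandidate`, `RemoteHit`, `decide`, the hit counters, and `hot_threshold` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "leave the block in its remote region"
    MIGRATE = "move the block into the local region"
    REPLICATE = "copy the block into the local region"


@dataclass
class VictimCandidate:
    """Hypothetical state of the block that would be evicted locally."""
    valid: bool          # False means the local way is empty
    recently_used: bool  # True means eviction would likely cause a miss


@dataclass
class RemoteHit:
    """Hypothetical activity counters for the block that hit remotely."""
    requester_hits: int  # recent hits from the requesting core
    owner_hits: int      # recent hits from cores near the block's current region


def decide(victim: VictimCandidate, hit: RemoteHit, hot_threshold: int = 2) -> Action:
    """Choose an action for a remote hit (illustrative policy only)."""
    # A block the requester rarely touches is not worth local capacity.
    if hit.requester_hits < hot_threshold:
        return Action.NONE
    # Evicting a live local block would trade hit latency for extra misses.
    if victim.valid and victim.recently_used:
        return Action.NONE
    # Still active where it lives: keep the original and add a local copy.
    if hit.owner_hits >= hot_threshold:
        return Action.REPLICATE
    # Active only for the requester: move it closer.
    return Action.MIGRATE
```

Under these assumptions, a block that is hot for both the requester and its current region is replicated, a block that is hot only for the requester is migrated, and a recently used local victim blocks both actions so that capacity is not wasted.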
Source Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list, 2013, No. 8, pp. 1583-1591 (9 pages)
Funding National Natural Science Foundation of China (60970036)
Keywords non-uniform cache architecture; chip multiprocessor; latency reduction; migration; replication

