Designing Redundant Read and Delay Write Optimisation on SPM Architecture and Its Implementation
Abstract: As microprocessor architectures have evolved, many processors have chosen to organise on-chip SRAM as scratchpad memory (SPM), a software-managed, non-cache structure. SPM is simple to implement and offers low access latency and high bandwidth. To use the limited on-chip SPM space effectively and improve program performance, either the user must lay out and transfer data explicitly, or the compiler must perform efficient automatic memory access optimisation. Redundant read and delay write optimisation starts from the association among the multiple main-memory accesses within a loop and automatically optimises data transfer and buffering, which raises the data reuse rate in SPM. Tests show that it effectively improves program performance.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2015, No. 2, pp. 10-13 (4 pages)
Funding: National High Technology Research and Development Program of China (2012AA010903)
Keywords: SPM; memory access optimisation; association; redundant read and delay write; data reuse
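
The abstract describes the optimisation only in outline, so the following C sketch is an illustration of the general idea rather than the paper's compiler algorithm. It assumes a loop that reads `a[i]` and `a[i+1]` and writes `b[i]`. The "naive" version issues one transfer per main-memory reference, while the "buffered" version fetches each tile once, serving both reads from the SPM buffer, and delays the writes until a whole tile of results is ready. The names `dma_get`, `dma_put`, `mem_a`, `mem_b`, and the sizes `N` and `TILE` are hypothetical; the DMA calls are simulated with `memcpy` and a counter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define N    1024   /* loop trip count (assumed for illustration) */
#define TILE 64     /* elements per SPM buffer (assumed for illustration) */

double mem_a[N + 1];   /* simulated main memory: input, read as a[i] and a[i+1] */
double mem_b[N];       /* simulated main memory: output */
long n_get, n_put;     /* number of simulated transfers issued */

/* Hypothetical DMA primitives: copy n elements and count one transfer. */
void dma_get(double *spm, const double *mem, size_t n) {
    memcpy(spm, mem, n * sizeof *spm);
    n_get++;
}

void dma_put(double *mem, const double *spm, size_t n) {
    memcpy(mem, spm, n * sizeof *mem);
    n_put++;
}

/* Fill the input, clear the output, and reset the transfer counters. */
void init_input(void) {
    for (int i = 0; i <= N; i++)
        mem_a[i] = 0.5 * i;
    memset(mem_b, 0, sizeof mem_b);
    n_get = n_put = 0;
}

/* Naive version: every main-memory reference becomes its own transfer,
 * so a[i+1] is fetched again as a[i] on the next iteration. */
void run_naive(void) {
    for (int i = 0; i < N; i++) {
        double x, y, r;
        dma_get(&x, &mem_a[i], 1);
        dma_get(&y, &mem_a[i + 1], 1);
        r = x + y;
        dma_put(&mem_b[i], &r, 1);
    }
}

/* Buffered version: one read per tile covers both a[i] and a[i+1]
 * (redundant reads removed), and results are written back once per
 * tile (writes delayed). Only the single boundary element between
 * adjacent tiles is still fetched twice. */
void run_buffered(void) {
    double spm_a[TILE + 1], spm_b[TILE];
    assert(N % TILE == 0);
    for (int base = 0; base < N; base += TILE) {
        dma_get(spm_a, &mem_a[base], TILE + 1);
        for (int j = 0; j < TILE; j++)
            spm_b[j] = spm_a[j] + spm_a[j + 1];
        dma_put(&mem_b[base], spm_b, TILE);
    }
}

/* Run both versions, print their transfer counts, and return 1 if the
 * outputs are identical. The counters are left holding the buffered run. */
int compare_versions(void) {
    static double ref[N];
    init_input();
    run_naive();
    memcpy(ref, mem_b, sizeof ref);
    printf("naive:    %ld gets, %ld puts\n", n_get, n_put);
    init_input();
    run_buffered();
    printf("buffered: %ld gets, %ld puts\n", n_get, n_put);
    return memcmp(ref, mem_b, sizeof ref) == 0;
}
```

Calling `compare_versions()` from a `main` function runs both variants on the same input. With the sizes assumed here, the naive loop issues 2048 reads and 1024 writes, whereas the buffered loop issues 16 of each and produces the same output. On real SPM hardware the benefit would also depend on transfer granularity and on overlapping transfers with computation, which this sketch does not model.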