
MPCore Cache Bandwidth Testing and Its Impact on Parallel Programming (cited by: 1)

Test of the MPCore cache bandwidth and considerations for efficient software execution
Abstract: Chip multiprocessors (CMPs) raise CPU performance without raising the processor clock frequency. When the cores operate in parallel on shared data, however, maintaining cache coherency puts more pressure on the memory interface than a single-threaded design does, which lowers both the data-transfer bandwidth available to each CPU and the efficiency of the parallel program. To address this problem, the bandwidth of each cache level of the MPCore and of cache-to-cache transfers was measured under Linux. The results show that cache-to-cache transfers effectively reduce the load on main memory. Based on these results, the paper proposes using the synchronized pipelined parallelism model (SPPM) for parallel programming on the MPCore, and experiments show that parallel programs written with SPPM run more efficiently than those written with the spatial decomposition model (SDM).
Authors: ZHOU Yu (周余), DU Sidan (都思丹)
Source: Electronic Measurement Technology (《电子测量技术》), 2008, No. 6, pp. 166-169 (4 pages)
Keywords: chip multiprocessor (CMP); ARM11 MPCore; cache coherency; bandwidth testing
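
The abstract names the two programming models but does not show their structure. The sketch below is only an illustration under stated assumptions: it assumes SDM means splitting the data into slices, with each thread running every stage on its own slice, and SPPM means a producer thread running the first stage block by block while a consumer thread follows closely behind and runs the second stage on each finished block. The names `stage1`, `stage2`, `run_sdm`, `run_sppm` and the block size are hypothetical and do not come from the paper. Python threads cannot reproduce the cache-to-cache bandwidth effect measured on the MPCore; the code shows only how the work is divided.

```python
import queue
import threading


def stage1(x):
    # Hypothetical first processing stage (not from the paper).
    return x * 2


def stage2(x):
    # Hypothetical second processing stage (not from the paper).
    return x + 1


def run_sdm(data, workers=2):
    """Spatial decomposition: each thread runs both stages on its own slice."""
    out = [None] * len(data)
    chunk = (len(data) + workers - 1) // workers

    def work(lo, hi):
        for i in range(lo, hi):
            out[i] = stage2(stage1(data[i]))

    threads = [
        threading.Thread(
            target=work,
            args=(w * chunk, min((w + 1) * chunk, len(data))),
        )
        for w in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


def run_sppm(data, block=4):
    """Pipelined: the consumer handles each block right after the producer."""
    # A small bounded queue keeps the consumer close behind the producer,
    # which is the property that would let a block stay in cache on real
    # multicore hardware (an assumption about the model, not measured here).
    handoff = queue.Queue(maxsize=2)
    out = []

    def producer():
        for lo in range(0, len(data), block):
            handoff.put([stage1(x) for x in data[lo:lo + block]])
        handoff.put(None)  # sentinel: no more blocks

    def consumer():
        while True:
            blk = handoff.get()
            if blk is None:
                break
            out.extend(stage2(x) for x in blk)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

Both functions compute the same result for the same input; they differ only in which thread touches which data and when, which is the property the abstract ties to main-memory load.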