Performance Analysis and Optimization of Prefetching Thread on CMPs
Abstract: Memory latency has become a critical bottleneck to achieving high performance on modern processors. Helper-thread prefetching on chip multiprocessors (CMPs) is a well-known approach to reducing memory latency and has been explored in different applications. This paper analyzes and tests the data-prefetching performance of helper threads for memory-intensive applications on a CMP. The analysis and experimental results show that the computation/access latency ratio (CALR) of hotspots has an important effect on prefetching performance. Dividing the memory-access work between the helper thread and the main thread according to the hotspot CALR optimizes the helper thread: prefetching gains better performance when the memory access ratio between the main thread and the prefetching thread is close to (1-CALR)/(1+CALR). The thread prefetching performance of several benchmarks from the Olden and SPEC2006 suites is tested, and the results reflect the impact of different memory access ratios between the prefetching thread and the main thread. When the hotspot computation is small enough to be neglected, prefetching gains the most when the ratio of helper-thread to main-thread memory-access tasks is close to 1.
Authors: Huang Yan (黄艳), Gu Zhimin (古志民)
Source: Journal of University of Electronic Science and Technology of China (《电子科技大学学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2012, No. 1, pp. 85-91 (7 pages)
Funding: Ministry of Education-Intel Special Research Fund for Information Technology (MOE-INTEL-08-10); Beijing Key Discipline Construction Project
Keywords: chip multiprocessor (CMP); computation/access latency ratio (CALR); hotspot; performance analysis; prefetching thread
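
The ratio stated in the abstract can be illustrated with a small numeric sketch. This is not the authors' implementation: the function names, the clamping of the ratio at zero, and the reading of the ratio as a split of a fixed number of hotspot memory accesses are all assumptions made here for illustration.

```python
def target_access_ratio(calr):
    """Main-thread / prefetching-thread access ratio (1 - CALR) / (1 + CALR).

    The formula is the one quoted in the abstract; clamping at 0 for
    CALR >= 1 is an assumption of this sketch.
    """
    if calr < 0:
        raise ValueError("CALR must be non-negative")
    return max(0.0, (1.0 - calr) / (1.0 + calr))


def split_accesses(total, calr):
    """Split `total` hotspot memory accesses into (main, helper) counts.

    Hypothetical helper: chooses counts so that main / helper is close
    to the target ratio, with main + helper == total.
    """
    ratio = target_access_ratio(calr)
    helper = round(total / (1.0 + ratio))
    return total - helper, helper


if __name__ == "__main__":
    # Negligible hotspot computation (CALR = 0): the ratio is 1,
    # so the two threads share the accesses equally.
    print(target_access_ratio(0.0))   # 1.0
    print(split_accesses(1000, 0.0))  # (500, 500)
```

With CALR = 0 the sketch reproduces the case described in the abstract, where the helper thread and the main thread take on roughly equal shares of the memory accesses; as CALR grows, a larger share moves to the helper thread.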