
Optimization Method of Streaming Storage Based on GCC Compiler (基于GCC编译器的流式存储优化方法) Cited by: 2
Abstract: To mitigate the cache pollution and compulsory misses caused by streaming store accesses, some high-performance general-purpose processor platforms provide a dedicated path, with supporting instructions, that accesses memory directly without going through the cache. In common streaming-store scenarios, judicious use of direct main-memory access can improve the overall performance of the on-chip memory system. However, deciding when direct main-memory access pays off is tedious and error-prone for programmers; an effective alternative is to have the compiler do it automatically. Therefore, building on an in-depth analysis of the performance benefit of different types of memory-access operations under streaming-store access patterns, this paper proposes a streaming-store optimization method based on the GCC compiler. The method is carried out automatically by the compiler and is transparent to the programmer. In GCC's SSA-GIMPLE stage it identifies contiguous or strided writes with streaming-access characteristics inside program loops, selects optimization candidates according to benefit analysis and dependence relations, and finally generates direct main-memory access instructions by matching instruction templates in the compiler back end. Contiguous/strided write cases and the STREAM benchmark and its variants are used for experimental evaluation on the domestic Sunway (申威, SW) processor platform. The results show that the proposed method significantly reduces the execution time of streaming-store applications: the average speedup on the STREAM benchmark after optimization is 1.31. In addition, the optimization works better in combination with loop unrolling, with which the average speedup on the STREAM benchmark reaches 1.45.
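As an illustration of the loop shapes the abstract refers to, the following C sketch shows a contiguous write, a strided write, and a loop with a loop-carried dependence on the written array. It is only a sketch under stated assumptions: the function names are hypothetical, the abstract does not give the pass's actual selection criteria, and no Sunway direct-access instruction is shown because the abstract does not name one. The first two loops write each element once and never read it back, which is the streaming-store pattern described above; the third rereads the value it has just written, so it is the kind of loop that a benefit and dependence filter would plausibly leave on the cached path.

```c
#include <assert.h>
#include <stddef.h>

/* STREAM-style "scale" kernel (hypothetical name): a[i] is written once
   per iteration at consecutive addresses and is never read in the loop.
   This is a contiguous streaming write. */
void scale_contiguous(double *a, const double *b, double q, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = q * b[i];
}

/* Strided streaming write (hypothetical name): every stride-th element
   of a is written once and never read in the loop. */
void fill_strided(double *a, double v, size_t n, size_t stride)
{
    assert(stride > 0); /* a zero stride would never terminate */
    for (size_t i = 0; i < n; i += stride)
        a[i] = v;
}

/* Counter-example (hypothetical name): in-place prefix sum. The element
   written in iteration i is read again in iteration i + 1, so the store
   is not a pure streaming write and benefits from staying in cache. */
void prefix_sum(double *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i] + a[i - 1];
}
```

All three functions compile with any C99 compiler; whether a given compiler turns the first two into cache-bypassing stores depends on the target and on the optimization described in the paper, which this sketch does not reproduce.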
Authors: GAO Xiu-wu (高秀武); HUANG Liang-ming (黄亮明); JIANG Jun (姜军) (Jiangnan Institute of Computing Technology, Wuxi, Jiangsu 214083, China)
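Source: Computer Science (《计算机科学》; indexed in CSCD and the Peking University Core list), 2022, No. 11, pp. 76-82 (7 pages)
Funding: National Key Research and Development Program of China (2020YFB0204602); Comprehensive Research Project (compiler optimization enhancement techniques for Sunway processors).
Keywords: GCC compiler; direct main-memory access; compiler optimization; code generation; domestic processor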
