
A Compilation Framework of Dataflow Programs for Many-Core Architecture

Cited by: 2
Abstract: The dataflow programming model, a domain-specific approach that combines the characteristics of media applications with programming-language design, has been widely applied in many fields. Many-core processors have become the mainstream solution and an industry standard, and exploiting the characteristics of many-core architecture to improve the performance of stream applications remains a major research challenge. This paper proposes an efficient stream compilation framework for many-core architecture that optimizes the execution of stream applications through three strategies. First, a rate-optimal software pipelining schedule is constructed to improve parallelism. Second, a buffer allocation algorithm allocates data storage for the pipelined schedule and eliminates redundant buffer copy operations. Third, logical cores are mapped to physical cores so as to reduce communication and synchronization overhead. The framework is implemented on Godson-T, and experiments show an average performance improvement of about 58% over the unoptimized version.
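To make the third strategy concrete, the sketch below shows one way a logical-to-physical core mapping problem can be posed. It is not the paper's algorithm, which the abstract does not describe. It assumes, purely for illustration, a row-major 2D mesh of cores, a cost equal to data volume times Manhattan hop count, and an exhaustive search that is only practical for very small examples. The names `hops`, `comm_cost`, `best_mapping`, and the `traffic` table are hypothetical.

```python
from itertools import permutations

def hops(a, b, width):
    """Manhattan distance between two physical core ids on a row-major 2D mesh."""
    ax, ay = a % width, a // width
    bx, by = b % width, b // width
    return abs(ax - bx) + abs(ay - by)

def comm_cost(mapping, traffic, width):
    """Sum of (data volume x hop count) over all communicating logical core pairs.

    mapping[i] is the physical core id assigned to logical core i.
    traffic maps a pair (u, v) of logical cores to an assumed data volume.
    """
    return sum(vol * hops(mapping[u], mapping[v], width)
               for (u, v), vol in traffic.items())

def best_mapping(n_logical, traffic, width, height):
    """Try every placement and keep the cheapest one (illustrative only)."""
    best = None
    for perm in permutations(range(width * height), n_logical):
        cost = comm_cost(perm, traffic, width)
        if best is None or cost < best[0]:
            best = (cost, perm)
    return best

# Hypothetical traffic between four logical cores forming a ring.
traffic = {(0, 1): 10, (1, 2): 10, (2, 3): 10, (0, 3): 1}
naive = comm_cost((0, 1, 2, 3), traffic, 2)   # identity placement on a 2x2 mesh
cost, placement = best_mapping(4, traffic, 2, 2)
print(naive, cost, placement)
```

In this toy case the identity placement puts two communicating pairs on diagonal corners, while the searched placement keeps every communicating pair one hop apart. A real compiler would need a heuristic or an ILP formulation, since exhaustive search grows factorially with the number of cores.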
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2014, No. 7, pp. 1560-1569 (10 pages).
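Funding: Supported by the National High Technology Research and Development Program of China (863 Program) key project (2012AA010902); the Specialized Research Fund for the Doctoral Program of Higher Education (20120142110089); the Open Fund of the State Key Laboratory, Institute of Computing Technology, Chinese Academy of Sciences; and the IBM X10 Innovation fund.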
Keywords: compilation framework; dataflow programs; many-core processor; software pipelining; parallelism

