Communication code generation for automatic parallelization of irregular loops
Abstract: Irregular computation is widespread in large-scale parallel applications. In automatic parallelization for distributed-memory architectures, it is difficult to generate parallel code for irregular loops at compile time. The communication code within the parallel code strongly affects both the correctness of the program's results and the speedup achieved. By analyzing the program's array redistribution graph and using partially redundant communication to maintain the producer-consumer relation of irregular array accesses, effective communication code can be generated automatically at compile time for a common class of irregular loops. The method uses the computation decomposition and the access expressions of the array references to compute the local definition set of the irregular array on each processor, which serves as the communication data set. It then analyzes the communication strategies for partitioning such irregular loops and generates the corresponding communication code. Experimental results show the expected speedup and confirm the validity of the method.
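The notion of a local definition set can be illustrated with a small sketch. It is not the paper's algorithm: it assumes a simple block computation decomposition, an irregular write of the form `A[write_idx[i]] = ...` followed by an irregular read `... = A[read_idx[i]]`, and index arrays whose values are already known. The function names (`block_iterations`, `accessed_elements`, `send_sets`) are hypothetical and chosen only for this illustration.

```python
def block_iterations(n_iters, n_procs, p):
    """Iterations owned by processor p under an assumed block decomposition."""
    chunk = (n_iters + n_procs - 1) // n_procs  # ceiling division
    return range(min(p * chunk, n_iters), min((p + 1) * chunk, n_iters))


def accessed_elements(index_array, n_procs, p):
    """Array elements touched by processor p through A[index_array[i]].

    Applied to the write index array, this is the local definition set.
    """
    return {index_array[i] for i in block_iterations(len(index_array), n_procs, p)}


def send_sets(write_idx, read_idx, n_procs):
    """Elements each producer must send to each consumer.

    An element goes from p to q when p defines it and q later reads it.
    """
    defs = [accessed_elements(write_idx, n_procs, p) for p in range(n_procs)]
    uses = [accessed_elements(read_idx, n_procs, q) for q in range(n_procs)]
    result = {}
    for p in range(n_procs):
        for q in range(n_procs):
            common = defs[p] & uses[q]
            if p != q and common:
                result[(p, q)] = sorted(common)
    return result


if __name__ == "__main__":
    write_idx = [3, 0, 2, 5, 1, 4]
    read_idx = [0, 1, 2, 3, 4, 5]
    print(accessed_elements(write_idx, 2, 0))
    print(send_sets(write_idx, read_idx, 2))
```

In a real compiler the index arrays are usually unknown until run time, so the generated code would have to evaluate such sets during execution or communicate a conservative superset of them. The abstract does not specify how the partially redundant scheme does this, so the sketch leaves that part out.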
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2014, No. 4, pp. 1014-1018 (5 pages)
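Funding: National Science and Technology Major Project "Core Electronic Devices, High-end General Chips and Basic Software" ("核高基") (2009ZX01036-001-001-2)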
Keywords: automatic parallelization; irregular loop; irregular array; computation decomposition; partial redundancy