Parallelism Recognition Technology Based on Nested Loops Classifying

Cited by: 5
Abstract: Most existing distributed-memory parallelizing compiler systems are built on top of shared-memory parallelizing compiler systems. The parallelism recognition technology of shared-memory compilers is suited to OpenMP code generation: it processes every nested loop with the same recognition method, so when it is reused in a distributed-memory compiler it cannot exploit program parallelism efficiently. A distributed-memory parallelizing compiler should instead classify nested loops by their structural characteristics and apply recognition technology suited to MPI code generation. To address this problem, this paper proposes a new classification method for nested loops, based on the structure of the loops and the characteristics of MPI parallel programs, and presents a corresponding parallelism recognition technique for each class of nested loop. Experimental results show that, compared with distributed-memory parallelizing compiler systems that use existing parallelism recognition technology, a compiler system that classifies nested loops with the proposed method and applies the corresponding recognition techniques identifies parallel loops in the benchmark programs more efficiently, and the speedup of the automatically generated MPI code improves by more than 20%.
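Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, No. 10, pp. 2695-2704 (10 pages)
Funding: National Science and Technology Major Project "Core Electronic Devices, High-end General Chips and Basic Software" (核高基), grant 2009ZX01036-001-001-2
Keywords: parallelizing compiler; parallelism recognition; nested loops; model algorithm; traverse algorithm; interaction algorithm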