Parallelism Recognition Technology Based on Nested Loops Classifying (基于嵌套循环分类的并行识别技术)

Cited by: 5

Abstract: Most existing distributed memory parallelizing compiler systems are developed on top of shared memory parallelizing compiler systems. The parallelism recognition technologies of shared memory parallelizing compilers are suited to OpenMP code generation and process all nested loops with the same recognition method, so when they are applied to distributed memory parallelizing compilers, the parallelism of a program cannot be exploited efficiently. A distributed memory parallelizing compiler should instead classify nested loops by their structure and apply recognition technologies suited to MPI code generation. To address this problem, this paper proposes a new classification method for nested loops, based on the structure of the nested loops and the characteristics of MPI parallel programs, and presents a corresponding parallelism recognition technology for each class of nested loop. Experimental results show that, compared with distributed memory parallelizing compiler systems that use traditional parallelism recognition technologies, a compiler system that classifies nested loops with the proposed method and applies the corresponding recognition technologies recognizes parallel loops in the benchmark programs more efficiently, and the speedup of the automatically generated MPI parallel code improves by more than 20%.
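The abstract does not state the classification criteria or the recognition algorithms, so the following is only a minimal sketch of the general idea of "classify a loop nest by its structure, then dispatch to a class-specific recognizer". It assumes a standard compiler distinction (single loop, perfectly nested, imperfectly nested) as a stand-in for the paper's classes. The names `Loop`, `Stmt`, `classify`, and `choose_recognizer`, and the strategy labels, are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Stmt:
    """A non-loop statement, kept as opaque text."""
    text: str


@dataclass
class Loop:
    """A loop with an index variable and a body of statements and loops."""
    var: str
    body: List[Union["Loop", Stmt]] = field(default_factory=list)


def is_perfect_nest(loop: Loop) -> bool:
    """True if each non-innermost level contains exactly one child, a loop."""
    inner = [n for n in loop.body if isinstance(n, Loop)]
    if not inner:
        return True  # innermost level: statements only
    if len(loop.body) != 1:
        return False  # statements or sibling loops next to the inner loop
    return is_perfect_nest(inner[0])


def classify(loop: Loop) -> str:
    """Assumed stand-in classification; the paper's actual classes may differ."""
    if not any(isinstance(n, Loop) for n in loop.body):
        return "single"
    return "perfect" if is_perfect_nest(loop) else "imperfect"


def choose_recognizer(loop: Loop) -> str:
    """Map each structural class to a hypothetical recognition strategy label."""
    strategies = {
        "single": "test the loop itself for loop-carried dependences",
        "perfect": "analyze all levels of the nest together",
        "imperfect": "analyze each inner nest and the statements between them",
    }
    return strategies[classify(loop)]


# Example nests: for i { for j { a[i][j] = 0 } } and for i { s = 0; for j { ... } }
perfect = Loop("i", [Loop("j", [Stmt("a[i][j] = 0")])])
imperfect = Loop("i", [Stmt("s = 0"), Loop("j", [Stmt("s += a[i][j]")])])
single = Loop("i", [Stmt("b[i] = i")])
```

The point of the dispatch table is that each structural class can get its own analysis instead of one uniform method for every nest, which is the contrast the abstract draws with OpenMP-oriented recognition.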

Authors: ZHAO Jie (赵捷), ZHAO Rongcai (赵荣彩), DING Rui (丁锐), HUANG Pinfeng (黄品丰)

Affiliation: Institute of Information Engineering, PLA Information Engineering University (解放军信息工程大学信息工程学院)

Source: Journal of Software (《软件学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, No. 10, pp. 2695-2704 (10 pages)

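Funding: National Science and Technology Major Project "Core Electronic Devices, High-end Generic Chips and Basic Software" (核高基) (2009ZX01036-001-001-2)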

Keywords: parallelizing compiler; parallelism recognition; nested loops; model algorithm; traverse algorithm; interaction algorithm

Classification number: TP314 [Automation and Computer Technology: Computer Software and Theory]
