Compiler Optimization Algorithm for Parallel Program on Shared Memory Machine (共享内存结构并行程序的编译器优化)

Cited by: 1

Abstract: Automatic parallelization on shared-memory machines usually exploits loop-level parallelism under the fork-join execution model, and the resulting parallel performance leaves room for improvement. This paper combines the strengths of the fork-join and single program multiple data (SPMD) execution models. During parallelizing compilation, parallel regions are merged and extended to produce a hybrid fork-join/SPMD execution model, and within each SPMD parallel region a barrier synchronization optimization based on the cross-processor dependence graph eliminates redundant barriers. Analysis and validation show that these strategies reduce the number of parallel regions and barrier synchronizations and effectively improve the performance of the generated parallel programs.
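The abstract does not spell out the barrier optimization, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes every loop in an SPMD region uses the same block partition of iterations, models each loop by the array elements an iteration reads and writes, and reports that a barrier between two consecutive loops is needed only when some element is touched by different processors with at least one write involved. All names (`owner`, `accesses`, `barrier_needed`, the three example loops) are hypothetical.

```python
def owner(i, n, nprocs):
    """Processor that runs iteration i under a block partition of n iterations."""
    block = -(-n // nprocs)  # ceiling division
    return i // block


def accesses(loop, n, nprocs):
    """Map each array element to the set of processors that write / read it."""
    writers, readers = {}, {}
    for i in range(n):
        p = owner(i, n, nprocs)
        for elem in loop["writes"](i):
            writers.setdefault(elem, set()).add(p)
        for elem in loop["reads"](i):
            readers.setdefault(elem, set()).add(p)
    return writers, readers


def crosses(procs_a, procs_b):
    """True if the two accesses can come from two different processors."""
    return any(p != q for p in procs_a for q in procs_b)


def barrier_needed(first, second, n, nprocs):
    """True if a cross-processor dependence exists between two consecutive loops."""
    w1, r1 = accesses(first, n, nprocs)
    w2, r2 = accesses(second, n, nprocs)
    for elem in set(w1) | set(r1):
        if (crosses(w1.get(elem, ()), r2.get(elem, ()))        # flow dependence
                or crosses(r1.get(elem, ()), w2.get(elem, ()))  # anti dependence
                or crosses(w1.get(elem, ()), w2.get(elem, ()))):  # output dependence
            return True
    return False


N, P = 8, 4  # assumed problem size and processor count

# loop1: a[i] = b[i] + 1
loop1 = {"writes": lambda i: [("a", i)], "reads": lambda i: [("b", i)]}
# loop2: c[i] = a[i] * 2 (each processor reads only what it wrote itself)
loop2 = {"writes": lambda i: [("c", i)], "reads": lambda i: [("a", i)]}
# loop3: d[i] = c[i] + c[i+1] (reads a neighbour's element at block edges)
loop3 = {"writes": lambda i: [("d", i)],
         "reads": lambda i: [("c", i), ("c", min(i + 1, N - 1))]}

print(barrier_needed(loop1, loop2, N, P))  # False: barrier can be dropped
print(barrier_needed(loop2, loop3, N, P))  # True: barrier must stay
```

In this toy model the barrier between the first two loops is redundant because each processor consumes only data it produced, while the third loop reads across block boundaries and so still needs synchronization.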

Authors: ZHANG Ping (张平), LI Qingbao (李清宝), ZHAO Rongcai (赵荣彩)

Affiliation: Institute of Information Engineering, PLA Information Engineering University (解放军信息工程大学信息工程学院)

Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2006, No. 1, pp. 13-16 (4 pages)

Funding: Key research project funded by national ministries and commissions

Keywords: cross-processor dependence; barrier synchronization; SPMD parallel region; data dependence graph

Classification: TP31 [Automation and Computer Technology: Computer Software and Theory]
