The extension of OpenMP parallel programming model to support transactional memory execution (article in English) Cited by: 1

Abstract: OpenMP is a popular multithreaded parallel programming model for multi-core (CMP) architectures, but OpenMP compilers do not check for data dependences, memory access conflicts, or other problems that can cause a program to execute incorrectly. Traditionally, programmers rely entirely on locks to guarantee correctness. Lock-based parallel programming trades program efficiency against programming difficulty: coarse-grained locking is easy to write but exposes little of the application's parallelism, whereas fine-grained locking exposes more parallelism but is hard to program and prone to problems such as priority inversion, deadlock, and lock convoying. Extending OpenMP to support transactional memory execution by means of dynamic binary instrumentation can effectively alleviate this tension between parallel program efficiency and parallel programming difficulty.
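The granularity trade-off described in the abstract can be illustrated with a small sketch. This is not the paper's code: the histogram workload, the function names `histogram_coarse` and `histogram_fine`, and the constant `NUM_BINS` are assumptions chosen for illustration. The abstract does not give the syntax of the transactional extension, so only the conventional lock-style variants are shown, using the standard OpenMP `critical` and `atomic` constructs.

```c
#include <stddef.h>

#define NUM_BINS 8  /* hypothetical bin count, chosen for illustration */

/* Coarse-grained variant: a single critical section guards the whole
 * histogram. It is easy to write, but every update is serialized, even
 * when two threads touch different bins. */
void histogram_coarse(const int *data, size_t n, long bins[NUM_BINS])
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        unsigned b = (unsigned)data[i] % NUM_BINS;
        #pragma omp critical
        {
            bins[b]++;
        }
    }
}

/* Fine-grained variant: each update is protected individually, so
 * threads only contend when they hit the same bin. With explicit
 * per-bin locks instead of atomics, the programmer would also have to
 * manage lock ordering to avoid deadlock. */
void histogram_fine(const int *data, size_t n, long bins[NUM_BINS])
{
    #pragma omp parallel for
    for (long i = 0; i < (long)n; i++) {
        unsigned b = (unsigned)data[i] % NUM_BINS;
        #pragma omp atomic
        bins[b]++;
    }
}
```

Both functions produce the same counts; they differ only in how much concurrency the synchronization allows. Compiled without OpenMP support, the pragmas are ignored and the loops run serially with the same result.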

Authors: YANG Xiaoqi (杨晓奇), ZHENG Qilong (郑启龙), CHEN Guoliang (陈国良)

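Affiliations: National High Performance Computing Center; Department of Computer Science and Technology, University of Science and Technology of China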

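Source: Journal of University of Science and Technology of China (《中国科学技术大学学报》, JUSTC), 2009, No. 11, pp. 1224-1231 (8 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.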

Funding: Supported by the National Natural Science Fund Key Project (60533020), the Natural Science Fund of Anhui Province (090412068), and the Major Project of National Science and Technology (2009ZX01034-001-001-002).

Keywords: CMP (multi-core); OpenMP; transactional memory execution

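Classification: TP311.1 [Automation and Computer Technology: Computer Software and Theory]; TP338.6 [Automation and Computer Technology: Computer System Architecture]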
