The Extension of the OpenMP Parallel Programming Model to Support Transactional Memory Execution (article in English)
Cited by: 1
Abstract: Although OpenMP is a popular multithreaded parallel programming model for multi-core (CMP) architectures, OpenMP compilers do not check for data dependences, memory access conflicts, or other problems that can cause a program to execute incorrectly. Traditionally, guarding against these problems is left entirely to programmers, who use locks to keep their programs correct. Lock-based parallel programming, however, forces a trade-off between program efficiency and programming difficulty. Coarse-grained locks are easy to write but exploit little of the available parallelism; fine-grained locks expose more parallelism but are hard to program and prone to problems such as priority inversion, deadlock, and lock convoying. Extending OpenMP to support transactional memory execution by means of dynamic binary instrumentation can effectively alleviate this conflict between program efficiency and programming difficulty in OpenMP parallel programming.
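The lock-granularity trade-off described in the abstract can be illustrated with a minimal sketch in C with OpenMP. It is not taken from the paper: the `bank_t` type, the function names, and the transfer workload are hypothetical and chosen only for illustration. The coarse-grained version guards every transfer with a single named critical section, which is easy to get right but serializes all updates. The fine-grained version takes one lock per account, which lets unrelated transfers proceed in parallel but obliges the programmer to acquire locks in a fixed order to avoid deadlock. The serial fallback for the lock routines is an assumption added so the sketch also builds without OpenMP enabled.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed serial fallback: without OpenMP the pragmas are ignored and
   the lock routines become no-ops, so the code still runs correctly. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { (void)l; }
static void omp_set_lock(omp_lock_t *l)     { (void)l; }
static void omp_unset_lock(omp_lock_t *l)   { (void)l; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

#define NACCOUNTS 4

/* Hypothetical shared state: balances plus one lock per account. */
typedef struct {
    long balance[NACCOUNTS];
    omp_lock_t lock[NACCOUNTS];
} bank_t;

void bank_init(bank_t *b, long initial)
{
    for (int k = 0; k < NACCOUNTS; k++) {
        b->balance[k] = initial;
        omp_init_lock(&b->lock[k]);
    }
}

void bank_destroy(bank_t *b)
{
    for (int k = 0; k < NACCOUNTS; k++)
        omp_destroy_lock(&b->lock[k]);
}

/* Coarse-grained: one global critical section. Simple and obviously
   correct, but every transfer waits for every other transfer. */
void transfer_coarse(bank_t *b, int from, int to, long amount)
{
    assert(from >= 0 && from < NACCOUNTS && to >= 0 && to < NACCOUNTS);
    #pragma omp critical(bank_global)
    {
        b->balance[from] -= amount;
        b->balance[to]   += amount;
    }
}

/* Fine-grained: lock only the two accounts involved. Transfers on
   disjoint accounts can overlap, but locks must always be taken in
   ascending index order, otherwise two threads can deadlock. */
void transfer_fine(bank_t *b, int from, int to, long amount)
{
    assert(from >= 0 && from < NACCOUNTS && to >= 0 && to < NACCOUNTS);
    if (from == to)
        return; /* OpenMP simple locks are not reentrant */
    int first  = from < to ? from : to;
    int second = from < to ? to : from;
    omp_set_lock(&b->lock[first]);
    omp_set_lock(&b->lock[second]);
    b->balance[from] -= amount;
    b->balance[to]   += amount;
    omp_unset_lock(&b->lock[second]);
    omp_unset_lock(&b->lock[first]);
}

/* Run n transfers in parallel and return the sum of all balances,
   which must stay constant if the synchronization is correct. */
long run_transfers(bank_t *b, int n, int use_fine)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        int from = i % NACCOUNTS;
        int to   = (i * 7 + 3) % NACCOUNTS;
        if (use_fine)
            transfer_fine(b, from, to, 1);
        else
            transfer_coarse(b, from, to, 1);
    }
    long total = 0;
    for (int k = 0; k < NACCOUNTS; k++)
        total += b->balance[k];
    return total;
}
```

Calling `bank_init` and then `run_transfers` with either strategy should leave the total balance unchanged. The lock-ordering rule in `transfer_fine` is the kind of burden that transactional memory execution aims to lift, by letting the programmer write the simpler coarse-style block while conflicts are detected at run time.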
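Source: Journal of University of Science and Technology of China (JUSTC), 2009, No. 11, pp. 1224-1231 (8 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.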
Funding: Supported by the National Natural Science Fund Key Project (60533020), the Natural Science Foundation of Anhui Province (090412068), and the Major Project of National Science and Technology (2009ZX01034-001-001-002).
Keywords: multi-core (CMP); OpenMP; transactional memory execution