Framework of Concurrent Multicast Queue and the Performance Analysis of Its Several Implementations (original title: 并发多播队列的实现框架及其多种实现的性能分析)
Abstract: Developing concurrent data structures that are both easy to use and efficient is important for lowering the difficulty of parallel programming and for making effective use of parallel resources. DetMP, the proposed easy-to-program deterministic message-passing multithreaded programming model, can be implemented not only on the proposed single-producer multicast shared virtual memory model (SPMC) but also on the traditional multithreaded shared virtual memory model. To analyze how the implementation mechanisms of the message channel (such as the storage organization of data and the synchronization control of concurrent accesses) affect the performance of DetMP programs, this paper presents CMQue, a framework for concurrent multicast queues, and implements six concurrent multicast queues on top of Pthreads. The six queues and the SPMC channel are evaluated with different DetMP applications. The experimental results show that the implementation mechanism of the message channel has a large effect on program performance, and that the SPMC channel scales well when enough CPU cores are available.
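Authors: ZHANG Qiliang (张其良), ZHANG Yu (张昱)
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), CSCD, Peking University Core Journal, 2017, No. 6, pp. 1237-1242 (6 pages)
Funding: National Natural Science Foundation of China (61229201); National High Technology Research and Development Program of China ("863" Program) (2012AA010901)
Keywords: multicast queue; concurrent data structure; synchronization control; multithreaded programming model; producer-consumer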
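
To make the term "concurrent multicast queue" concrete, the following is a minimal sketch of one possible design: a bounded ring buffer with a single producer, in which every consumer receives every item in order. It is not one of the six CMQue implementations evaluated in the paper, whose details this record does not give. The names `MulticastQueue`, `put`, `get`, and `run_demo` are hypothetical, and Python's `threading.Condition` stands in for a Pthreads mutex plus condition variable.

```python
import threading


class MulticastQueue:
    """Bounded single-producer queue; every consumer receives every item.

    Illustrative only: one lock and one condition variable guard all state.
    """

    def __init__(self, capacity, num_consumers):
        self._buf = [None] * capacity
        self._capacity = capacity
        self._head = 0                      # total items published so far
        self._tails = [0] * num_consumers   # items already read, per consumer
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            # A slot can be reused only after the slowest consumer has read it.
            while self._head - min(self._tails) >= self._capacity:
                self._cond.wait()
            self._buf[self._head % self._capacity] = item
            self._head += 1
            self._cond.notify_all()

    def get(self, consumer_id):
        with self._cond:
            # Block until the producer has published an item this consumer
            # has not yet seen.
            while self._tails[consumer_id] == self._head:
                self._cond.wait()
            item = self._buf[self._tails[consumer_id] % self._capacity]
            self._tails[consumer_id] += 1
            self._cond.notify_all()
            return item


def run_demo(num_items=100, num_consumers=3, capacity=4):
    """One producer (the calling thread) multicasts 0..num_items-1."""
    q = MulticastQueue(capacity, num_consumers)
    results = [[] for _ in range(num_consumers)]

    def consume(cid):
        for _ in range(num_items):
            results[cid].append(q.get(cid))

    threads = [threading.Thread(target=consume, args=(cid,))
               for cid in range(num_consumers)]
    for t in threads:
        t.start()
    for i in range(num_items):
        q.put(i)
    for t in threads:
        t.join()
    return results
```

In this sketch the storage organization (a shared ring buffer with per-consumer read positions) and the synchronization control (a single coarse lock) are exactly the kind of design choices the abstract says affect performance; other variants could, for example, use per-consumer buffers or finer-grained synchronization.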