Abstract
Owing to their low power consumption and low cost, heterogeneous multi-core processors account for a significant share of the computing resources in today's supercomputers. However, because these processors feature high bandwidth and loosely coupled memory consistency, programmers must pay closer attention to low-level hardware details to obtain good memory and compute performance. This paper presents CellMLP, a multi-level parallelism model for the Cell BE, a representative heterogeneous multi-core processor. By extending C with compiler directives, CellMLP supports the data-parallel, task-parallel, and pipeline-parallel programming models and improves parallel programming productivity. On the runtime side, data parallelism uses parallel SPE data transfers and double buffering to raise data transfer bandwidth. Task parallelism uses a novel hybrid task queue that supports asynchronous work stealing, which reduces contention among SPE threads and improves scalability. Pipeline parallelism uses, for the first time, the blocking signal channels of the Cell BE to implement low-overhead synchronization between SPE threads. Experiments were conducted on Stream, the NAS Benchmark, BOTS, and other typical irregular applications. The results show that CellMLP supports a range of typical parallel applications efficiently. Compared with the similar programming models SARC and CellSs, CellMLP shows clear advantages in achieved data transfer bandwidth and in support for irregular applications.
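The double buffering mentioned in the abstract can be illustrated with a minimal sketch in plain C. This is not CellMLP's runtime code. The names `fetch_chunk`, `sum_double_buffered`, and `CHUNK` are hypothetical, and `memcpy` stands in for an asynchronous DMA transfer from main memory into an SPE's local store, so only the control flow is shown, not real overlap of transfer and computation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* elements per local buffer (hypothetical size) */

/* Stand-in for an asynchronous DMA "get" into local store (assumption:
   on real hardware this would only start the transfer; memcpy finishes at once). */
static void fetch_chunk(float *local, const float *main_mem, size_t count) {
    memcpy(local, main_mem, count * sizeof(float));
}

/* Sum n floats using two local buffers: the next chunk is requested into
   buf[cur ^ 1] before the current chunk in buf[cur] is processed. */
float sum_double_buffered(const float *data, size_t n) {
    float buf[2][CHUNK];
    float total = 0.0f;
    size_t offset = 0;
    size_t cur_count = n < CHUNK ? n : CHUNK;
    int cur = 0;

    assert(data != NULL || n == 0);
    if (n == 0) return 0.0f;

    fetch_chunk(buf[cur], data, cur_count);  /* prefetch the first chunk */

    while (cur_count > 0) {
        size_t next_offset = offset + cur_count;
        size_t remaining = n - next_offset;
        size_t next_count = remaining < CHUNK ? remaining : CHUNK;

        /* Request the next chunk into the idle buffer. */
        if (next_count > 0)
            fetch_chunk(buf[cur ^ 1], data + next_offset, next_count);

        /* A real kernel would wait here for the transfer into buf[cur]. */
        for (size_t i = 0; i < cur_count; i++)
            total += buf[cur][i];

        /* Swap buffer roles for the next iteration. */
        offset = next_offset;
        cur_count = next_count;
        cur ^= 1;
    }
    return total;
}
```

Calling `sum_double_buffered` on an array holding the values 1 through 10 walks through three chunks (4, 4, and 2 elements) while alternating between the two buffers.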
Source
Journal of Software (《软件学报》)
Indexed in: EI, CSCD, PKU Core (北大核心)
2013, No. 12, pp. 2782-2796 (15 pages)
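Funding
National Natural Science Foundation of China (61303050)
National Key Technology R&D Program of the 12th Five-Year Plan (2011BAK08B04)
National High Technology Research and Development Program of China (863 Program) (2011AA01A205)
Open Project of the Key Laboratory of Computer System and Architecture, Chinese Academy of Sciences (CARCH201108)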
Keywords
heterogeneous multi-core
data parallelism
task parallelism
pipeline parallelism
irregular applications
compiler optimization