Abstract
Nodes of a distributed parallel rendering cluster can be equipped with multi-core CPUs and multiple GPUs, forming an intra-node multi-CPU, multi-GPU system. Existing intra-node parallel rendering models do not make full use of the computing power of the multi-core CPU, and they serially couple the rendering, readback, and composition stages, which causes frequent GPU idle stalls and severely degrades intra-node parallel rendering performance. This paper proposes an efficient intra-node parallel rendering model. It combines software rendering with hardware rendering to decouple hardware rendering from image composition, and it uses asynchronous DMA transfer to build a three-stage intra-node parallel pipeline of rendering, readback, and composition. Compared with existing intra-node parallel rendering models, the parallel hybrid rendering model both lowers the GPU idle rate and raises CPU utilization. Theoretical analysis and experiments show that, for the same application, the parallel hybrid rendering model achieves 3 to 4 times the performance of existing models and offers better data scalability and performance scalability.
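As a rough illustration of why overlapping rendering, readback, and composition helps, the following is a minimal timing sketch, not the authors' model. It assumes hypothetical fixed per-frame stage costs, one frame per stage at a time, and unlimited buffering between stages. It does not model the software (CPU) rendering part of the hybrid scheme, so it is not expected to reproduce the reported 3 to 4 times figure.

```python
def serial_time(frames, render, readback, composite):
    """Total time when every frame runs render, readback, composite back to back."""
    return frames * (render + readback + composite)


def pipelined_time(frames, render, readback, composite):
    """Total time for an idealized three-stage pipeline (assumed model).

    Each stage handles one frame at a time; a frame enters a stage as soon as
    both the frame and the stage are ready.
    """
    stages = (render, readback, composite)
    finish = [0.0, 0.0, 0.0]  # when each stage finished its most recent frame
    for _ in range(frames):
        ready = 0.0  # when the current frame becomes available to the next stage
        for i, cost in enumerate(stages):
            start = max(ready, finish[i])
            finish[i] = start + cost
            ready = finish[i]
    return finish[-1]


def speedup(frames, render, readback, composite):
    """Ratio of serial time to pipelined time for the same workload."""
    return serial_time(frames, render, readback, composite) / pipelined_time(
        frames, render, readback, composite
    )


if __name__ == "__main__":
    # Hypothetical stage costs in arbitrary time units.
    print(speedup(100, 1.0, 1.0, 1.0))
    print(speedup(100, 2.0, 1.0, 1.0))
```

In this idealized model the steady-state frame time is set by the slowest stage, so the gain from pipelining alone is bounded by the number of stages and shrinks when one stage dominates.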
Source
《系统仿真学报》
CAS
CSCD
PKU Core Journal (北大核心)
2012, No. 1, pp. 94-98, 112 (6 pages)
Journal of System Simulation
Funding
National "973" Program of China (2009CB723803)
National Natural Science Foundation of China (61170157)