
Intra-Node Parallel Hybrid Rendering Model for Multi-Core CPUs and Multiple GPUs (cited by: 3)

Hybrid Rendering Model for Multi-CPU Multi-GPU Distributed Parallel Rendering Cluster Node
Abstract: Nodes in a distributed parallel rendering cluster can be configured with multi-core CPUs and multiple GPUs, forming an intra-node multi-CPU, multi-GPU system. Existing intra-node parallel rendering models do not make full use of the computing power of the multi-core CPUs, and they couple the rendering, readback, and composition stages serially, which causes frequent GPU stalls and idle time and severely degrades intra-node parallel rendering performance. This paper proposes an efficient intra-node parallel rendering model. By combining software rendering with hardware rendering, it decouples hardware rendering from image composition, and by using asynchronous DMA transfer it builds a three-stage parallel pipeline of rendering, readback, and composition within a node. Compared with existing intra-node models, the parallel hybrid rendering model both lowers the GPU idle rate and raises CPU utilization. Theoretical analysis and experiments show that, for the same application, the model achieves 3 to 4 times the performance of existing models and offers better data scalability and performance scalability.
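The three-stage pipeline described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `stage`, `render`, `readback`, `composite`, and `run_pipeline` are hypothetical, the stage bodies are placeholders for GPU rendering, asynchronous DMA readback, and CPU-side composition, and Python threads are used only to show the structure (they would not give real parallel speedup for CPU-bound work).

```python
import queue
import threading

SENTINEL = None  # marks the end of the frame stream


def stage(work, inbox, outbox):
    """Apply `work` to each item from `inbox` and forward the result to `outbox`."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            break
        outbox.put(work(item))


def render(frame_id):
    # Placeholder for hardware (GPU) rendering of one frame.
    return {"frame": frame_id, "pixels": [frame_id] * 4}


def readback(rendered):
    # Placeholder for an asynchronous DMA copy from GPU to host memory.
    return {"frame": rendered["frame"], "host_pixels": list(rendered["pixels"])}


def composite(host):
    # Placeholder for CPU-side image composition.
    return (host["frame"], sum(host["host_pixels"]))


def run_pipeline(num_frames, depth=2):
    """Push frames through render -> readback -> composite, one thread per stage."""
    to_render = queue.Queue(maxsize=depth)
    to_readback = queue.Queue(maxsize=depth)
    to_composite = queue.Queue(maxsize=depth)
    done = queue.Queue()  # unbounded, so the last stage never blocks
    threads = [
        threading.Thread(target=stage, args=(render, to_render, to_readback)),
        threading.Thread(target=stage, args=(readback, to_readback, to_composite)),
        threading.Thread(target=stage, args=(composite, to_composite, done)),
    ]
    for t in threads:
        t.start()
    for frame_id in range(num_frames):
        to_render.put(frame_id)  # blocks when the pipeline is full
    to_render.put(SENTINEL)
    for t in threads:
        t.join()
    results = []
    while True:
        item = done.get()
        if item is SENTINEL:
            break
        results.append(item)
    return results


if __name__ == "__main__":
    print(run_pipeline(3))
```

Because each stage runs in its own thread and hands work over through a bounded queue, frame n+1 can be rendered while frame n is being read back and frame n-1 is being composited, which is the overlap that a serially coupled model lacks.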
Source: Journal of System Simulation (《系统仿真学报》; indexed in CAS, CSCD, PKU Core), 2012, No. 1, pp. 94-98, 112 (6 pages)
Funding: National "973" Program of China (2009CB723803); National Natural Science Foundation of China (61170157)
Keywords: multi-GPU; multi-CPU; distributed parallel rendering; asynchronous composition; DMA


