From SSE to OpenCL: Comparison of Parallel Algorithms for Skeletal Animation on Multi-core CPUs
Abstract: Skeletal animation with high-precision skinning and complex bone bindings faces a major performance bottleneck at render time. Previous studies accelerated animation on the GPU, but high-end GPUs are too costly, and the general-purpose computing performance of low- and mid-range GPUs is sometimes below that of high-end CPUs. To fully exploit the algorithm's performance on multi-core CPUs and compensate for the limited general-purpose computing performance of low- and mid-range GPUs, a newer integrated parallel scheme based on OpenCL, which covers both instruction-level and thread-level parallelism, is proposed. It is compared with the traditional scheme that handles the two levels separately, combining SSE for instruction-level parallelism with OpenMP for thread-level parallelism. Experimental results show that on most CPUs and across data of varying complexity, the OpenCL-based scheme clearly outperforms the SSE-based scheme, and its performance advantage grows as data complexity increases.
Source: Journal of System Simulation (《系统仿真学报》); indexed in CAS, CSCD, and the Peking University Core Journals list; 2015, No. 2, pp. 336-343, 351 (9 pages)
Funding: Open Fund of the Key Laboratory of Digital Ocean Science and Technology, State Oceanic Administration (KLDO201303)
Keywords: skeletal animation; parallel computing; OpenCL; SSE