Research of the SIMD and Vector Math Library (cited by: 10)
Abstract: First, taking Intel, AMD, and IBM Cell processors as examples, this paper introduces single instruction, multiple data (SIMD) vectorization technology and the characteristics of each vendor's implementation. Second, selected vector math functions from the libraries developed by each of the three vendors are tested on their respective platforms. The results show that vectorization delivers a substantial speedup over conventional scalar computation. In particular, the Cell SDK functions reach an average speedup of 10, owing to the Cell's distinctive architecture and its multiple vector processing units. Finally, a comparison of the test results shows that vector functions from different math libraries also differ in performance. The analysis attributes these differences mainly to processor architecture, the number of vector computing units, and memory access.
Source: Computer Science (《计算机科学》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 7, pp. 298-301 (4 pages)
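Funding: Supported by the National 863 Program of China (2006AA01A125, 2009AA01A129, 2009AA01A134), the National Natural Science Foundation of China (60303032), and the Key Program of the National Natural Science Foundation of China (60533020)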
Keywords: vectorization; SSE; MMX; 3DNow!; SIMD