Vector Regroup for the Domestic CPU SW-1600


Abstract: Vector regroup instructions are complex, and different instructions have different latencies, so it is difficult to find a single efficient vector regroup algorithm. This paper analyzes the shift and insert/extract instructions provided by the domestic CPU SW-1600. It proposes optimal algorithms that implement vector regroup using shift instructions alone or insert/extract instructions alone, and then combines the two instruction classes into an efficient vector regroup algorithm. Experiments show that the algorithm vectorizes programs well: the speedup reaches 7.31 for integer data and 1.83 for complex double-precision floating-point programs.
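To make the idea in the abstract concrete, the following is a minimal sketch of choosing between a whole-vector shift and per-lane extract/insert when regrouping lanes. It is an illustration under stated assumptions, not the paper's algorithm and not the SW-1600 instruction set: the vector is modeled as a Python list, the function names (`shift_lanes`, `regroup_insert_extract`, `regroup`) are hypothetical, and every operation is assumed to cost one unit, whereas the abstract notes that real instructions have differing latencies.

```python
from typing import List, Optional, Tuple

Move = Tuple[int, int]  # (source lane, destination lane)


def shift_lanes(vec: List[int], k: int, fill: int = 0) -> List[int]:
    """Shift every lane by k positions (k > 0 moves toward higher indices).

    Lanes shifted out are dropped; vacated lanes receive `fill`.
    """
    n = len(vec)
    out = [fill] * n
    for i, value in enumerate(vec):
        j = i + k
        if 0 <= j < n:
            out[j] = value
    return out


def regroup_insert_extract(src: List[int], moves: List[Move],
                           fill: int = 0) -> Tuple[List[int], int]:
    """Regroup lane by lane: one extract plus one insert per move."""
    out = [fill] * len(src)
    ops = 0
    for s, d in moves:
        scalar = src[s]  # modeled "extract"
        out[d] = scalar  # modeled "insert"
        ops += 2
    return out, ops


def uniform_shift(moves: List[Move]) -> Optional[int]:
    """Return the common offset if all moves share one, else None."""
    offsets = {d - s for s, d in moves}
    return offsets.pop() if len(offsets) == 1 else None


def regroup(src: List[int], moves: List[Move],
            fill: int = 0) -> Tuple[List[int], int]:
    """Use a single shift when possible, else fall back to extract/insert.

    Lanes not named as a destination in `moves` are treated as don't-care.
    Returns the regrouped vector and the modeled operation count.
    """
    k = uniform_shift(moves)
    if k is not None:
        return shift_lanes(src, k, fill), 1
    return regroup_insert_extract(src, moves, fill)
```

In this model, moving lanes 0, 1, 2 to lanes 1, 2, 3 is a single shift, while swapping lanes 0 and 3 needs two extract/insert pairs. A real cost model would weight each operation by its actual latency.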

Authors: Wei Shuai (魏帅), Zhao Rongcai (赵荣彩), Yao Yuan (姚远)

Affiliation: Institute of Information Engineering, PLA Information Engineering University

Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2011, No. 11, pp. 230-233, 275 (5 pages)

Keywords: SIMD (Single Instruction Multiple Data); SW-1600; vector regroup; SLP

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]



