A Taylor series expansion(TSE) based design for minimum mean-square error(MMSE) and QR decomposition(QRD) of multi-input and multi-output(MIMO) systems is proposed based on application specific instruction set process...A Taylor series expansion(TSE) based design for minimum mean-square error(MMSE) and QR decomposition(QRD) of multi-input and multi-output(MIMO) systems is proposed based on application specific instruction set processor(ASIP), which uses TSE algorithm instead of resource-consuming reciprocal and reciprocal square root(RSR) operations.The aim is to give a high performance implementation for MMSE and QRD in one programmable platform simultaneously.Furthermore, instruction set architecture(ISA) and the allocation of data paths in single instruction multiple data-very long instruction word(SIMD-VLIW) architecture are provided, offering more data parallelism and instruction parallelism for different dimension matrices and operation types.Meanwhile, multiple level numerical precision can be achieved with flexible table size and expansion order in TSE ISA.The ASIP has been implemented to a 28 nm CMOS process and frequency reaches 800 MHz.Experimental results show that the proposed design provides perfect numerical precision within the fixed bit-width of the ASIP, higher matrix processing rate better than the requirements of 5G system and more rate-area efficiency comparable with ASIC implementations.展开更多
The requirement of the flexible and effective implementation of the Elliptic Curve Cryptography (ECC) has become more and more exigent since its dominant position in the public-key cryptography application.Based on an...The requirement of the flexible and effective implementation of the Elliptic Curve Cryptography (ECC) has become more and more exigent since its dominant position in the public-key cryptography application.Based on analyzing the basic structure features of Elliptic Curve Cryptography (ECC) algorithms,the parallel schedule algorithm of point addition and doubling is presented.And based on parallel schedule algorithm,the Application Specific Instruction-Set Co-Processor of ECC that adopting VLIW architecture is also proposed in this paper.The coprocessor for ECC is implemented and validated using Altera’s FPGA.The experimental result shows that our proposed coprocessor has advantage in high performance and flexibility.展开更多
With greater flexibility and less cost, there is a trend that application specific instruction set processor(ASIP) will become the alternative implementation style to application of specific integrated circuit(ASIC). ...With greater flexibility and less cost, there is a trend that application specific instruction set processor(ASIP) will become the alternative implementation style to application of specific integrated circuit(ASIC). Architecture model is a key component in ASIP design flow. A novel ASIP model, xpMODEL, was presented. Its key features include: explicit specification of the memory subsystem allowing novel memory organizations and hierarchies; the introduction of meta-operator and instruction behavior extended finite state machine providing xpMODEL with ability to model execution sequencing, inherent parallelism, data/control/structural hazards, and out-of-order execution mode in ASIP. A comparison with other ASIP models shows the superiority of xpMODEL.展开更多
The rapid development of multimedia techniques has increased the demands on multimedia processors. This paper presents a new design method to quickly design high performance processors for new multimedia applications....The rapid development of multimedia techniques has increased the demands on multimedia processors. This paper presents a new design method to quickly design high performance processors for new multimedia applications. In this approach, a configurable processor based on the very long instruction-set word architecture is used as the basic core for designers to easily configure new processor cores for multimedia algorithm. Specific instructions designed for multimedia applications efficiently improve the performance of the target processor. Functions not implemented in the digital signal processor (DSP) core can be easily integrated into the target processor as user-defined hardware to increase the performance. Several examples are given based on the architecture. The results show that the processor performance is enhanced approximately 4 times on the H.263 codec and that the processor outperforms both DSPs and single instruction multiple data (SIMD) multimedia extension architectures by up to 8 times when computing the 2-D-IDCT.展开更多
提出一种基于FPGA的专用处理器设计.它是用于高级加密标准的超小面积设计,支持密钥扩展(现在设计为128位密钥),加密和解密.这个设计采用了完全的8位数据路径宽度,创新的字节替换电路和乘累加器结构,在最小规模的Xilinx Spartan II FPGA...提出一种基于FPGA的专用处理器设计.它是用于高级加密标准的超小面积设计,支持密钥扩展(现在设计为128位密钥),加密和解密.这个设计采用了完全的8位数据路径宽度,创新的字节替换电路和乘累加器结构,在最小规模的Xilinx Spartan II FPGA芯片XC2S15上实现了一个高级加密标准AES的专用处理器,使用了不到60%的资源.当时钟为70MHz时,可以达到平均加密解密吞吐量2.1Mb/s.主要应用在把低资源占用,低功耗作优先考虑的场合.展开更多
基金Supported by the Industrial Internet Innovation and Development Project of Ministry of Industry and Information Technology (No.GHBJ2004)。
文摘A Taylor series expansion(TSE) based design for minimum mean-square error(MMSE) and QR decomposition(QRD) of multi-input and multi-output(MIMO) systems is proposed based on application specific instruction set processor(ASIP), which uses TSE algorithm instead of resource-consuming reciprocal and reciprocal square root(RSR) operations.The aim is to give a high performance implementation for MMSE and QRD in one programmable platform simultaneously.Furthermore, instruction set architecture(ISA) and the allocation of data paths in single instruction multiple data-very long instruction word(SIMD-VLIW) architecture are provided, offering more data parallelism and instruction parallelism for different dimension matrices and operation types.Meanwhile, multiple level numerical precision can be achieved with flexible table size and expansion order in TSE ISA.The ASIP has been implemented to a 28 nm CMOS process and frequency reaches 800 MHz.Experimental results show that the proposed design provides perfect numerical precision within the fixed bit-width of the ASIP, higher matrix processing rate better than the requirements of 5G system and more rate-area efficiency comparable with ASIC implementations.
基金supported by the national high technology research and development 863 program of China.(2008AA01Z103)
文摘The requirement of the flexible and effective implementation of the Elliptic Curve Cryptography (ECC) has become more and more exigent since its dominant position in the public-key cryptography application.Based on analyzing the basic structure features of Elliptic Curve Cryptography (ECC) algorithms,the parallel schedule algorithm of point addition and doubling is presented.And based on parallel schedule algorithm,the Application Specific Instruction-Set Co-Processor of ECC that adopting VLIW architecture is also proposed in this paper.The coprocessor for ECC is implemented and validated using Altera’s FPGA.The experimental result shows that our proposed coprocessor has advantage in high performance and flexibility.
基金National Natural Science Foundation of China(No. 60273042)
文摘With greater flexibility and less cost, there is a trend that application specific instruction set processor(ASIP) will become the alternative implementation style to application of specific integrated circuit(ASIC). Architecture model is a key component in ASIP design flow. A novel ASIP model, xpMODEL, was presented. Its key features include: explicit specification of the memory subsystem allowing novel memory organizations and hierarchies; the introduction of meta-operator and instruction behavior extended finite state machine providing xpMODEL with ability to model execution sequencing, inherent parallelism, data/control/structural hazards, and out-of-order execution mode in ASIP. A comparison with other ASIP models shows the superiority of xpMODEL.
基金Supported by the National Natural Science Foundation of China (No. 60236020)the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20050003083)
文摘The rapid development of multimedia techniques has increased the demands on multimedia processors. This paper presents a new design method to quickly design high performance processors for new multimedia applications. In this approach, a configurable processor based on the very long instruction-set word architecture is used as the basic core for designers to easily configure new processor cores for multimedia algorithm. Specific instructions designed for multimedia applications efficiently improve the performance of the target processor. Functions not implemented in the digital signal processor (DSP) core can be easily integrated into the target processor as user-defined hardware to increase the performance. Several examples are given based on the architecture. The results show that the processor performance is enhanced approximately 4 times on the H.263 codec and that the processor outperforms both DSPs and single instruction multiple data (SIMD) multimedia extension architectures by up to 8 times when computing the 2-D-IDCT.
文摘提出一种基于FPGA的专用处理器设计.它是用于高级加密标准的超小面积设计,支持密钥扩展(现在设计为128位密钥),加密和解密.这个设计采用了完全的8位数据路径宽度,创新的字节替换电路和乘累加器结构,在最小规模的Xilinx Spartan II FPGA芯片XC2S15上实现了一个高级加密标准AES的专用处理器,使用了不到60%的资源.当时钟为70MHz时,可以达到平均加密解密吞吐量2.1Mb/s.主要应用在把低资源占用,低功耗作优先考虑的场合.