Aims to provide the block architecture of CoStar3400 DSP that is a high performance, low power and scalable VLIW DSP core, it efficiently deployed a variable-length execution set (VLES) execution model which utilizes ...Aims to provide the block architecture of CoStar3400 DSP that is a high performance, low power and scalable VLIW DSP core, it efficiently deployed a variable-length execution set (VLES) execution model which utilizes the maximum parallelism by allowing multiple address generations and data arithmetic logic units to execute multiple instructions in a single clock cycle. The scalability was provided mainly in using more or less number of functional units according to the intended application. Low power support was added by careful architectural design techniques such as fine-grain clock gating and activation of only the required number of control signals at each stage of the pipeline. The said features of the core make it a suitable candidate for many SoC configurations, especially for compute intensive applications such as wire-line and wireless communications, including infrastructure and subscriber communications. The embedded system designers can efficiently use the scalability and VLIW features of the core by scaling the number of execution units according to specific needs of the application to effectively reduce the power consumption, chip area and time to market the intended final product.展开更多
Single instruction multiple data (SIMD) instructions are often implemented in modem media processors. Although SIMD instructions are useful in multimedia applications, most compilers do not have good support for SIM...Single instruction multiple data (SIMD) instructions are often implemented in modem media processors. Although SIMD instructions are useful in multimedia applications, most compilers do not have good support for SIMD instructions. This paper focuses on SIMD instructions generation for media processors. We present an efficient code optimization approach that is integrated into a retargetable C compiler. SIMD instructions are generated by finding and combining the same operations in programs. Experimental results for the UltraSPARC VIS instruction set show that a speedup factor up to 2.639 is obtained.展开更多
基金the National Natural Science Foundation of China(Grant No.60425413)COMSATS Institute of Information Technology, Pakistan
文摘Aims to provide the block architecture of CoStar3400 DSP that is a high performance, low power and scalable VLIW DSP core, it efficiently deployed a variable-length execution set (VLES) execution model which utilizes the maximum parallelism by allowing multiple address generations and data arithmetic logic units to execute multiple instructions in a single clock cycle. The scalability was provided mainly in using more or less number of functional units according to the intended application. Low power support was added by careful architectural design techniques such as fine-grain clock gating and activation of only the required number of control signals at each stage of the pipeline. The said features of the core make it a suitable candidate for many SoC configurations, especially for compute intensive applications such as wire-line and wireless communications, including infrastructure and subscriber communications. The embedded system designers can efficiently use the scalability and VLIW features of the core by scaling the number of execution units according to specific needs of the application to effectively reduce the power consumption, chip area and time to market the intended final product.
文摘Single instruction multiple data (SIMD) instructions are often implemented in modem media processors. Although SIMD instructions are useful in multimedia applications, most compilers do not have good support for SIMD instructions. This paper focuses on SIMD instructions generation for media processors. We present an efficient code optimization approach that is integrated into a retargetable C compiler. SIMD instructions are generated by finding and combining the same operations in programs. Experimental results for the UltraSPARC VIS instruction set show that a speedup factor up to 2.639 is obtained.