This paper proposes a flexible eight-mode high parallel Galois SIMD ASIP(Application Specific Instruction Set Processor).It supports parallel executions of Gold,Scrambling,CRC,CC,Turbo,RM,PSS,SSS encoding LFSR(linear ...This paper proposes a flexible eight-mode high parallel Galois SIMD ASIP(Application Specific Instruction Set Processor).It supports parallel executions of Gold,Scrambling,CRC,CC,Turbo,RM,PSS,SSS encoding LFSR(linear feedback shift registers)algorithms with high performance and flexibility.It can perform also general bit processing and m-sequence.Our design is based on proposed table conversion and a datapath for unified eight-mode encoding.Based on 28 nm digital CMOS technology,the total area is 0.177 mm2 and the clock frequency can be up to 1 GHz.The throughputs of Gold,Scrambling,CRC32,CRC24,CRC16,CRC8,CC,Turbo are 64 Gb/s,64 Gb/s,128 Gb/s,168 Gb/s,256 Gb/s,512 Gb/s,3×80 Gb/s,and 72 Gb/s,respectively.展开更多
Implementing video applications on emerging multi-core processors is a promising technique for personal, real-time multi-media applications. However, when porting the legacy parallel video encoders developed for clust...Implementing video applications on emerging multi-core processors is a promising technique for personal, real-time multi-media applications. However, when porting the legacy parallel video encoders developed for clusters to shared-memory multi-cores, the existing parallel algorithms result in workload imbalances on different cores and communication inefficiencies. This paper describes a strip-wise parallel scheme to balance workloads and a hybrid communication mechanism to reduce communication overhead. The implementation of the H.264 parallel encoder on an eight CPU Intel Xeon system achieves 5x to 6x speed-up over a single thread encoder and achieves a 29% performance improvement over the commonly used master-slave schemes on clusters. The paper also gives further analysis on scalability, parallel efficiency, workload balance, and communication overhead as the number of cores varies.展开更多
基金supported in part by the Project of the National Natural Science Foundation of China(Grant No.61961014)supported by the Hainan University project funding KYQD(ZR)1974。
文摘This paper proposes a flexible eight-mode high parallel Galois SIMD ASIP(Application Specific Instruction Set Processor).It supports parallel executions of Gold,Scrambling,CRC,CC,Turbo,RM,PSS,SSS encoding LFSR(linear feedback shift registers)algorithms with high performance and flexibility.It can perform also general bit processing and m-sequence.Our design is based on proposed table conversion and a datapath for unified eight-mode encoding.Based on 28 nm digital CMOS technology,the total area is 0.177 mm2 and the clock frequency can be up to 1 GHz.The throughputs of Gold,Scrambling,CRC32,CRC24,CRC16,CRC8,CC,Turbo are 64 Gb/s,64 Gb/s,128 Gb/s,168 Gb/s,256 Gb/s,512 Gb/s,3×80 Gb/s,and 72 Gb/s,respectively.
基金Supported by the National Natural Science Foundation of China(No. 60236020)
文摘Implementing video applications on emerging multi-core processors is a promising technique for personal, real-time multi-media applications. However, when porting the legacy parallel video encoders developed for clusters to shared-memory multi-cores, the existing parallel algorithms result in workload imbalances on different cores and communication inefficiencies. This paper describes a strip-wise parallel scheme to balance workloads and a hybrid communication mechanism to reduce communication overhead. The implementation of the H.264 parallel encoder on an eight CPU Intel Xeon system achieves 5x to 6x speed-up over a single thread encoder and achieves a 29% performance improvement over the commonly used master-slave schemes on clusters. The paper also gives further analysis on scalability, parallel efficiency, workload balance, and communication overhead as the number of cores varies.