Funding: Supported in part by the National Natural Science Foundation of China (No. 61601027) and the Opening Fund of the Space Objective Measure Key Laboratory (No. 2016011).
Abstract: Digital low-density parity-check (LDPC) decoders can hardly meet the power limits imposed by new application scenarios. The analog LDPC decoder, an application of analog computation technology, is considered to have the potential to address this issue to some extent. However, owing to the lack of automation tools and analog stopping criteria, analog LDPC decoders suffer from costly handcrafted design and additional decoding delay, and are not yet feasible for practical applications. To address these issues, a decoder architecture built from reusable building blocks is designed to reduce the handcrafted design effort, and a probability stopping criterion designed specifically for analog decoders is proposed and implemented to reduce the decoding delay. A (480, 240) CMOS analog LDPC decoder is then designed and fabricated in a 0.35-μm CMOS technology. Experimental results show that the decoder prototype achieves a throughput of 50 Mbps at a power consumption of about 86.3 mW, and that the decoding delay can be reduced by up to 93% compared with the preset maximum decoding delay used in existing works.
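The figures quoted in this abstract can be turned into two derived quantities, the code rate and the energy per bit. The sketch below is only a back-of-the-envelope check, not a result reported by the authors. It assumes that (480, 240) follows the usual (n, k) block-code notation and that the 50 Mbps throughput and 86.3 mW power figures refer to the same operating point; the abstract does not say whether the throughput counts information bits or coded bits. The function names are hypothetical.

```python
def code_rate(n: int, k: int) -> float:
    """Code rate k/n of an (n, k) block code (assumed notation)."""
    return k / n


def energy_per_bit_nj(power_mw: float, throughput_mbps: float) -> float:
    """Energy per bit in nanojoules, computed as power / throughput."""
    power_w = power_mw * 1e-3            # milliwatts -> watts
    bits_per_s = throughput_mbps * 1e6   # Mbps -> bits per second
    return power_w / bits_per_s * 1e9    # joules per bit -> nanojoules per bit


if __name__ == "__main__":
    # Values taken from the abstract; their interpretation is an assumption.
    print(f"code rate: {code_rate(480, 240):.2f}")
    print(f"energy per bit: {energy_per_bit_nj(86.3, 50.0):.3f} nJ/bit")
```

Under these assumptions the code rate is 1/2 and the prototype spends roughly 1.7 nJ per bit.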
Funding: Supported by the National Science and Technology Specific Project (2011ZX03005-004-003), the National Natural Science Foundation of China (Nos. 61071090 and 61171093), the 973 Project of Jiangsu Province (BK2011027), the Project 11KJA510001 and PAPD, and the Jiangsu Postgraduate Research Project (CXZZ11_0384).
Abstract: In this paper, a joint precoding and decoding design scheme is proposed for a two-way Multiple-Input Multiple-Output (MIMO) multiple-relay system. The precoding and decoding matrices are jointly optimized based on the Minimum Mean-Square-Error (MMSE) criterion under transmit power constraints. The optimization problem is solved with a convergent iterative algorithm that includes four sub-problems. It is shown that, because of the block-diagonal structure of the relay precoding matrix, the second sub-problem cannot be solved with existing methods. It is instead converted into a convex optimization problem, and a simplified method is proposed to reduce the computational complexity. Simulation results show that the proposed scheme achieves a lower Bit Error Rate (BER) and a larger sum rate than other schemes. Furthermore, the BER and sum-rate performance can be improved by increasing the number of antennas for a fixed number of relays, or by increasing the number of relays for a fixed number of antennas.
Abstract: The Long Term Evolution (LTE) system imposes strict requirements on dispatching delay. Moreover, the very high air-interface rate of LTE demands strong processing capability from the devices that process the baseband signals. Consequently, a single-core processor cannot meet the requirements of an LTE system. This paper analyzes how to use multi-core processors to achieve parallel processing of uplink demodulation and decoding in LTE systems, and designs an approach to such parallel processing. Test results indicate that the approach works well.
Funding: Supported in part by the National Natural Science Foundation of China (Nos. 61101072 and 61132002), the new strategic industries development projects of Shenzhen city (No. ZDSY20120616141333842), and the Tsinghua University Initiative Scientific Research Program (No. 2012Z10132).
Abstract: The design of a high-speed decoder using the traditional partly parallel architecture for Non-Quasi-Cyclic (NQC) Low-Density Parity-Check (LDPC) codes is a challenging problem because of its high memory-block cost and low hardware utilization efficiency. In this paper, we present efficient hardware implementation schemes for NQC-LDPC codes. First, we propose an implementation-oriented construction scheme for NQC-LDPC codes that avoids memory-access conflicts in the partly parallel decoder. Then, we propose a Modified Overlapped Message-Passing (MOMP) algorithm for the hardware implementation of NQC-LDPC codes. This algorithm doubles the hardware utilization efficiency and supports a higher degree of parallelism than the Overlapped Message-Passing (OMP) technique proposed in previous works. We also present single-core and multi-core decoder architectures based on the proposed MOMP algorithm to reduce memory cost and improve circuit efficiency. Moreover, we introduce a technique called the cycle bus to further reduce the number of block RAMs in multi-core decoders. Using numerical examples, we show that, for a rate-2/3, length-15360 NQC-LDPC code with 8.43-dB coding gain for Binary Phase-Shift Keying (BPSK) in an Additive White Gaussian Noise (AWGN) channel, the decoder with the proposed scheme achieves a 23.8%–52.6% reduction in logic utilization per Mbps and a 29.0%–90.0% reduction in message-memory bits per Mbps.