Improvement of digital FIR filter is vital in the field of Digital Signal Processing in order to reduce the area, delay and power. Multiplication and Accumulation (MAC) unit of Finite Impulse Response (FIR) filte...Improvement of digital FIR filter is vital in the field of Digital Signal Processing in order to reduce the area, delay and power. Multiplication and Accumulation (MAC) unit of Finite Impulse Response (FIR) filter has been designed using efficient multiplier and adder circuits for optimized APT (Area,Power and Timing) product. In this paper, the design of direct form FIR filter with efficient MAC unit has been presented. Initially, full adder and half adder structures are shrunk down by reducing number of gates. These compact full adder and half adder structures are incorporated into Wallace Multiplier and Improved Carry-Save Adder. The proposed 16-bit Carry-Save Adder has been improved by splitting into four parallel phases. Consequently the delay of enhanced Carry- Save Adder is reduced. Generation of carry output is performed using number of OR gates in a sequential manner. All these enhanced architectures are incorporated into the Digital FIR Filter to reduce the area, delay and power utilization.展开更多
A high-performance, low cost inverse integer transform architecture for advanced video standard (AVS) video coding standard was presented. An 8 × 8 inverse integer transform is required in AVS video system whic...A high-performance, low cost inverse integer transform architecture for advanced video standard (AVS) video coding standard was presented. An 8 × 8 inverse integer transform is required in AVS video system which is compute-intensive. A hardware transform is inevitable to compute the transform for the real-time application. Compared with the 4 × 4 transform for H.264/AVC, the 8 × 8 integer transform is much more complex and the coefficient in the inverse transform matrix Ts is not inerratic as that in H.264/AVC. Dividing the Ts into matrix Ss and Rs, the proposed architecture is implemented with the adders and the specific CSA-trees instead of multipliers, which are area and time consuming. The architecture obtains the data processing rate up to 8 pixels per-cycle at a low cost of area. Synthesized to TSMC 0.18 μm COMS process, the architecture attains the operating frequency of 300 MHz at cost of 34 252 gates with a 2-stage pipeline scheme. A reusable scheme is also introduced for the area optimization, which results in the operating frequency of 143 MHz at cost of only 19 758 gates.展开更多
文摘Improvement of digital FIR filter is vital in the field of Digital Signal Processing in order to reduce the area, delay and power. Multiplication and Accumulation (MAC) unit of Finite Impulse Response (FIR) filter has been designed using efficient multiplier and adder circuits for optimized APT (Area,Power and Timing) product. In this paper, the design of direct form FIR filter with efficient MAC unit has been presented. Initially, full adder and half adder structures are shrunk down by reducing number of gates. These compact full adder and half adder structures are incorporated into Wallace Multiplier and Improved Carry-Save Adder. The proposed 16-bit Carry-Save Adder has been improved by splitting into four parallel phases. Consequently the delay of enhanced Carry- Save Adder is reduced. Generation of carry output is performed using number of OR gates in a sequential manner. All these enhanced architectures are incorporated into the Digital FIR Filter to reduce the area, delay and power utilization.
文摘A high-performance, low cost inverse integer transform architecture for advanced video standard (AVS) video coding standard was presented. An 8 × 8 inverse integer transform is required in AVS video system which is compute-intensive. A hardware transform is inevitable to compute the transform for the real-time application. Compared with the 4 × 4 transform for H.264/AVC, the 8 × 8 integer transform is much more complex and the coefficient in the inverse transform matrix Ts is not inerratic as that in H.264/AVC. Dividing the Ts into matrix Ss and Rs, the proposed architecture is implemented with the adders and the specific CSA-trees instead of multipliers, which are area and time consuming. The architecture obtains the data processing rate up to 8 pixels per-cycle at a low cost of area. Synthesized to TSMC 0.18 μm COMS process, the architecture attains the operating frequency of 300 MHz at cost of 34 252 gates with a 2-stage pipeline scheme. A reusable scheme is also introduced for the area optimization, which results in the operating frequency of 143 MHz at cost of only 19 758 gates.