As the newest standard, High Efficiency Video Coding (HEVC) is designed to minimize the bitrate for video data transfer and to support High Definition (HD) and Ultra HD video resolutions, at the cost of increased computational complexity relative to earlier standards such as H.264. Real-time video decoding with an HEVC decoder is therefore a challenging task. Dequantization and Inverse Transform (DE/IT), which reconstruct the residual block, are among the most computationally intensive modules in the HEVC decoder. In this paper, a unified hardware architecture is proposed to implement the HEVC DE/IT module for all Transform Unit (TU) block sizes: 4×4, 8×8, 16×16 and 32×32. The architecture is designed using both High-Level Synthesis (HLS) and Low-Level Synthesis (LLS) methods in order to compare them and determine which is better suited to a real-time implementation of the DE/IT module. The HLS design is written in C/C++ and converted into an optimized hardware design for the DE/IT module with the Xilinx Vivado HLS tool. The LLS architecture is written in VHSIC Hardware Description Language (VHDL) and uses pipelining to decrease the processing time. Experimental results on the Xilinx XC7Z020 FPGA show that the LLS design increases throughput, in terms of frame rate, by 80% relative to the HLS design, with a 4.4% increase in the number of Look-Up Tables (LUTs). Compared with related works in the literature, the proposed architectures demonstrate significant advantages in hardware cost and performance.
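To make the DE/IT step concrete, the following is a minimal software sketch of dequantization followed by the 4×4 inverse core transform, the smallest of the four TU sizes. It is not the paper's architecture: it assumes 8-bit video, a flat scaling list (factor 16), and the standard HEVC 4×4 integer matrix, and the function names are chosen here for illustration only.

```python
# Reference-model sketch of HEVC-style dequantization and 4x4 inverse transform.
# Assumptions (not from the abstract): 8-bit video, flat scaling factor of 16.

LEVEL_SCALE = [40, 45, 51, 57, 64, 72]  # per-QP scaling, indexed by qp % 6

# 4x4 integer core transform matrix; each row is one basis vector.
T4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def clip16(x):
    """Clip to the signed 16-bit range used for intermediate values."""
    return max(-32768, min(32767, x))


def dequantize(levels, qp, bit_depth=8, log2_size=2):
    """Scale quantized levels back to transform coefficients."""
    bd_shift = bit_depth + log2_size - 5
    scale = (16 * LEVEL_SCALE[qp % 6]) << (qp // 6)
    rnd = 1 << (bd_shift - 1)
    return [[clip16((lv * scale + rnd) >> bd_shift) for lv in row]
            for row in levels]


def inverse_transform_4x4(coeffs, bit_depth=8):
    """Two-stage separable inverse transform: columns first, then rows."""
    n = 4
    # Stage 1 (vertical): fixed shift of 7 with rounding, then 16-bit clip.
    tmp = [[0] * n for _ in range(n)]
    for x in range(n):
        for i in range(n):
            acc = sum(T4[k][i] * coeffs[k][x] for k in range(n))
            tmp[i][x] = clip16((acc + 64) >> 7)
    # Stage 2 (horizontal): shift depends on the bit depth.
    shift2 = 20 - bit_depth
    rnd2 = 1 << (shift2 - 1)
    out = [[0] * n for _ in range(n)]
    for y in range(n):
        for j in range(n):
            acc = sum(T4[k][j] * tmp[y][k] for k in range(n))
            out[y][j] = (acc + rnd2) >> shift2
    return out


if __name__ == "__main__":
    levels = [[2, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    residual = inverse_transform_4x4(dequantize(levels, qp=4))
    print(residual)
```

A DC-only block is a convenient sanity check, since every residual sample should come out equal. A hardware design would typically replace the inner sums with shared butterfly stages, which is where pipelining pays off.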
This paper addresses the design of detailed architectures for Field-Programmable Gate Arrays (FPGAs), which has a great impact on the overall performance of an FPGA in practice. First, a novel FPGA architecture description model is proposed, based on the easy-to-use YAML file format. This format permits the description of any detailed architecture of hard blocks and channels. Then a general algorithm for building the FPGA resource graph is presented. The proposed model is scalable, can handle detailed architecture design, and can be used in an FPGA architecture evaluation system developed to enable such design. Experimental results show that a reduction of up to 16.36% in total wirelength and up to 9.34% in router effort can be obtained by making very small changes to the detailed architecture, which verifies the necessity and effectiveness of the proposed model.
In this paper, we propose a novel hardware architecture for the intra 16 × 16 module of the macroblock engine for the H.264 video coding standard. To reduce the cycle count of intra 16 × 16 prediction, transform/quantization, and inverse quantization/inverse transform in H.264, a method that overlaps these operations is proposed. The architecture can process one macroblock in 208 cycles for all macroblock types by performing the 4 × 4 Hadamard transform and quantization during 16 × 16 prediction. The module was designed in the VHDL hardware description language (HDL) and runs at 160 MHz on an ALTERA NIOS-II development board with a Stratix II EP2S60F1020C3 FPGA. The system also includes software running on a NIOS-II processor to implement the pre-processing and post-processing functions. Finally, the execution time of our hardware solution is 26% lower than that of the previous work.
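Two pieces of this abstract lend themselves to a short worked sketch: the 4 × 4 Hadamard transform applied to the DC coefficients, and what 208 cycles per macroblock at 160 MHz means as a rate. The code below is an illustration only. The scaling and quantization that accompany the Hadamard step are left out, and the 1920 × 1080 frame size, along with the assumption that this module alone sets the frame rate, are introduced here and are not claims from the paper.

```python
# Sketch: unscaled 4x4 Hadamard transform and a cycle-budget calculation.

H4 = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def matmul4(a, b):
    """Plain 4x4 integer matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def hadamard_4x4(block):
    """Compute H * X * H for a 4x4 block of DC coefficients (no scaling)."""
    return matmul4(matmul4(H4, block), H4)


def macroblocks_per_second(clock_hz, cycles_per_mb):
    """Macroblock rate if one macroblock completes every cycles_per_mb cycles."""
    return clock_hz / cycles_per_mb


def frames_per_second(clock_hz, cycles_per_mb, width, height):
    """Upper-bound frame rate; frame size is rounded up to whole macroblocks."""
    mbs_per_frame = ((width + 15) // 16) * ((height + 15) // 16)
    return macroblocks_per_second(clock_hz, cycles_per_mb) / mbs_per_frame


if __name__ == "__main__":
    print(hadamard_4x4([[1] * 4 for _ in range(4)]))
    print(round(macroblocks_per_second(160e6, 208)))
    print(round(frames_per_second(160e6, 208, 1920, 1080), 2))
```

Under those assumptions the module sustains roughly 769,000 macroblocks per second, which would be an upper bound of about 94 frames per second for a 1920 × 1080 frame of 8,160 macroblocks.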
Embedded systems used in real-time applications require low power, small area and high computation speed. In digital signal processing, image processing and communication applications, data often arrive at a continuously high rate. The arithmetic functions and matrix operations required may vary greatly among applications. RTL-based design and verification of one or more of these functions can be time-consuming. Some High-Level Synthesis tools reduce this design and verification time, but their results may not be optimal or suitable for low-power applications. The design tool proposed in this paper shortens design time by reducing the verification cycle, offering a fast design and verification platform for important matrix operations. These operations range from simple addition to more complex ones such as LU and QR factorizations. The tool generates Verilog code and a testbench that can be realized in FPGA and VLSI systems. The designed system uses MATLAB-based verification and reporting.
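As an illustration of the kind of matrix operation such a flow has to verify, here is a minimal software reference model of LU factorization. It is only a sketch: the tool described above generates Verilog and checks it with MATLAB, whereas this example uses the Doolittle scheme without pivoting in plain Python, and all names are hypothetical.

```python
# Sketch: Doolittle LU factorization (no pivoting) as a golden reference model.

def lu_decompose(a):
    """Return (L, U) with L unit lower triangular and U upper triangular, A = L*U."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Row i of U.
        for j in range(i, n):
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        if upper[i][i] == 0:
            raise ZeroDivisionError("zero pivot: this sketch does not pivot")
        # Column i of L, below the diagonal.
        for j in range(i + 1, n):
            partial = sum(lower[j][k] * upper[k][i] for k in range(i))
            lower[j][i] = (a[j][i] - partial) / upper[i][i]
    return lower, upper


def matmul(a, b):
    """Dense matrix product, used here to check that L*U reproduces A."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


if __name__ == "__main__":
    a = [[4, 3], [6, 3]]
    lower, upper = lu_decompose(a)
    print(lower, upper, matmul(lower, upper))
```

In a verification flow, the outputs of the generated hardware would be compared against a model like this one, usually within a tolerance that reflects the fixed-point word length of the design.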
With the development of integrated circuits, the content of the digital circuit experiment course is constantly updated. To keep up with the trend of the times and to make students’ professional knowledge meet the needs of industry, the school adopts an FPGA experimental platform, carries out teaching reform in terms of both platform and experiments, and plans the experiments so as to enrich their content. In this paper, the traditional knowledge points of logic algebra, flip-flops, timers, counters, decoders and digital tube displays are combined, and a digital clock system is designed and implemented. Practice shows that combining modern design methods with traditional digital circuit teaching produces a good teaching effect. In this way, students can fully learn, understand and skillfully use the new technology in the experiment, and in the process build a comprehensive understanding of digital circuits.
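The way counters and a decoder combine into a digital clock can be sketched in a few lines of behavioral code. This is an illustrative model only, not the course's design: it assumes a 24-hour clock, a one-second tick, and a seven-segment display with the common encoding in which bit 0 drives segment a and bit 6 drives segment g.

```python
# Sketch: cascaded counters plus a seven-segment decoder for a digital clock.

# Segment patterns for digits 0-9 (bit 0 = segment a ... bit 6 = segment g).
SEVEN_SEG = [0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F]


def tick(hours, minutes, seconds):
    """Advance the clock by one second; each counter carries into the next."""
    seconds += 1
    if seconds == 60:          # modulo-60 seconds counter overflows
        seconds = 0
        minutes += 1
        if minutes == 60:      # modulo-60 minutes counter overflows
            minutes = 0
            hours = (hours + 1) % 24  # modulo-24 hours counter
    return hours, minutes, seconds


def display_codes(hours, minutes, seconds):
    """Decode HH:MM:SS into six seven-segment patterns, most significant first."""
    digits = [hours // 10, hours % 10,
              minutes // 10, minutes % 10,
              seconds // 10, seconds % 10]
    return [SEVEN_SEG[d] for d in digits]


if __name__ == "__main__":
    state = (23, 59, 59)
    state = tick(*state)
    print(state, [hex(c) for c in display_codes(*state)])
```

In hardware, each `if` corresponds to the carry-out of one counter enabling the next, and the lookup table corresponds to the combinational decoder that drives the display.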
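To improve the ability of telecommunication network equipment to detect abnormal signaling access, 64K signaling must be analyzed and processed. To improve parsing efficiency and to meet the increasingly strict requirements in recent years for independently controllable products, a signaling parsing scheme based on a domestically produced Field Programmable Gate Array (FPGA) is designed. The overall design approach is presented, and the functional modules implemented in the FPGA are described in detail. The system is designed with a modular, parameterized method, and status parameters are added at key stages; this improves scalability and allows the internal running state of each module to be monitored. The result is efficient and flexible signaling parsing, with the main components all domestically produced. Tests show that the system supports STM-1 (Synchronous Transfer Module-1) data access, serial-to-parallel conversion and HDLC (High-level Data Link Control) deframing, and processes 32 channels of 64K signaling concurrently, with module running states available for query and inspection, achieving the expected results. Taking STM-1 as the example, the modular design of the existing functions can be extended smoothly to STM-4 and STM-16 applications.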
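The HDLC deframing step mentioned above can be illustrated with a small bit-level model. This is a sketch under stated assumptions rather than the paper's module: it handles a single channel, works on a list of bits, covers only flag detection and zero-bit destuffing, and omits frame check sequence (FCS) verification. The function names are hypothetical.

```python
# Sketch: HDLC flag detection and bit destuffing for one channel.

FLAG = [0, 1, 1, 1, 1, 1, 1, 0]  # 0x7E frame delimiter


def hdlc_frame(payload_bits):
    """Wrap payload bits in flags, inserting a 0 after every five consecutive 1s."""
    out = list(FLAG)
    ones = 0
    for bit in payload_bits:
        out.append(bit)
        ones = ones + 1 if bit == 1 else 0
        if ones == 5:
            out.append(0)  # stuffed bit keeps the payload from imitating a flag
            ones = 0
    out.extend(FLAG)
    return out


def hdlc_deframe(bits):
    """Return the destuffed payloads found between flags in a bit stream."""
    frames = []
    current = None  # None until the first flag has been seen
    ones = 0        # length of the current run of consecutive 1s
    for bit in bits:
        if bit == 1:
            ones += 1
            if ones >= 7:
                current = None          # seven 1s in a row: abort, resync on next flag
            elif current is not None:
                current.append(1)
        else:
            if ones == 6:
                # Flag found; drop its leading 0 and six 1s from the buffer.
                if current is not None:
                    payload = current[:-7]
                    if payload:
                        frames.append(payload)
                current = []
            elif ones == 5:
                pass                    # stuffed 0: discard it
            elif current is not None:
                current.append(0)
            ones = 0
    return frames


if __name__ == "__main__":
    stream = hdlc_frame([1, 1, 1, 1, 1, 1, 0, 1]) + hdlc_frame([0, 1, 0, 1])
    print(hdlc_deframe(stream))
```

A multi-channel design such as the 32-channel one described would keep one copy of the `current` and `ones` state per channel and update the copy belonging to whichever channel's bit arrives.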
Funding: This work was funded by the Deanship of Scientific Research at Jouf University (Kingdom of Saudi Arabia) under grant No. DSR-2021-02-0391.
Funding: Supported by the National High Technology Research and Development Program of China (No. 2012AA012301) and the National Science and Technology Major Project of China (No. 2013ZX03006004).