Based on the flexible quadtree partition structure of coding tree units(CTUs),the deblocking filter(DBF)in high efficiency video coding(HEVC)consumes a lot of resources when implemen-ted by hardware.It is difficult to...Based on the flexible quadtree partition structure of coding tree units(CTUs),the deblocking filter(DBF)in high efficiency video coding(HEVC)consumes a lot of resources when implemen-ted by hardware.It is difficult to achieve flexible switching between different sizes of coding blocks.Aiming at this problem,a reconfigurable implementation of DBF is proposed.Based on the dynamic programmable reconfigurable video array processor(DPRAP)with context switch reconfiguration mechanism,the runtime flexible switching of two coding block sizes is realized.The experimental results show that the highest work-frequency reaches 151.4 MHz.Compared with the dedicated hardware architecture scheme,the resource consumption can be reduced by 28.1%while realizing the dynamic switching between algorithms of two coding block sizes.Compared with the results of HM16.0,by using a complete I-frame for testing,the average peak signal-to-noise ratio(PSNR)of the reconfigurable implementation proposed in this paper has increased by 3.0508 dB,the coding quality has improved to a certain extent.展开更多
基金Supported by the National Natural Science Foundation of China(No.61834005,61772417,61802304,61602377,61874087,61634004)the Shaanxi Province Key R&D Plan(No.2021GY-029,2021KW-16).
文摘Based on the flexible quadtree partition structure of coding tree units(CTUs),the deblocking filter(DBF)in high efficiency video coding(HEVC)consumes a lot of resources when implemen-ted by hardware.It is difficult to achieve flexible switching between different sizes of coding blocks.Aiming at this problem,a reconfigurable implementation of DBF is proposed.Based on the dynamic programmable reconfigurable video array processor(DPRAP)with context switch reconfiguration mechanism,the runtime flexible switching of two coding block sizes is realized.The experimental results show that the highest work-frequency reaches 151.4 MHz.Compared with the dedicated hardware architecture scheme,the resource consumption can be reduced by 28.1%while realizing the dynamic switching between algorithms of two coding block sizes.Compared with the results of HM16.0,by using a complete I-frame for testing,the average peak signal-to-noise ratio(PSNR)of the reconfigurable implementation proposed in this paper has increased by 3.0508 dB,the coding quality has improved to a certain extent.