Abstract
The Cross-Component Linear Model (CCLM) adopted in the Versatile Video Coding (VVC) standard for chroma prediction cannot accurately fit the nonlinear relationship between chroma and luma. To address this shortcoming, the authors propose a VVC chroma prediction algorithm based on a convolutional neural network (CNN) with an attention mechanism. When predicting a chroma block, the algorithm feeds the CNN with reference information taken from the corresponding luma block and from the regions above and to the left of the chroma block to be predicted. The attention mechanism assigns weights to the luma-chroma relationships within this reference information, and the weighted result is then passed to the prediction network. Experimental results show that, compared with the standard VVC algorithm, the proposed method achieves average bit-rate savings of 0.64% and 0.68% for the U and V components, respectively, improving VVC coding performance.
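For context, the linear baseline that the abstract describes as insufficient can be sketched as follows. This is a simplified, floating-point illustration of the CCLM idea (chroma ≈ α · luma + β, with α and β fitted from neighbouring reconstructed samples), not the integer derivation used in the VVC standard and not the authors' CNN. The function names `fit_cclm` and `predict_chroma` are hypothetical, and the luma inputs are assumed to be already downsampled to the chroma grid.

```python
def fit_cclm(neigh_luma, neigh_chroma):
    """Fit chroma ~= alpha * luma + beta from neighbouring sample pairs.

    Simplified two-point fit through the neighbours with the smallest and
    largest luma values (an assumption made for illustration only).
    """
    if not neigh_luma or len(neigh_luma) != len(neigh_chroma):
        raise ValueError("need equally sized, non-empty neighbour lists")
    idx = range(len(neigh_luma))
    i_min = min(idx, key=lambda i: neigh_luma[i])
    i_max = max(idx, key=lambda i: neigh_luma[i])
    luma_span = neigh_luma[i_max] - neigh_luma[i_min]
    if luma_span == 0:
        # Flat luma neighbourhood: fall back to the mean chroma value.
        return 0.0, sum(neigh_chroma) / len(neigh_chroma)
    alpha = (neigh_chroma[i_max] - neigh_chroma[i_min]) / luma_span
    beta = neigh_chroma[i_min] - alpha * neigh_luma[i_min]
    return alpha, beta


def predict_chroma(luma_block, alpha, beta, bit_depth=8):
    """Apply the linear model sample by sample and clip to the valid range."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(int(round(alpha * y + beta)), 0), max_val) for y in row]
        for row in luma_block
    ]


if __name__ == "__main__":
    alpha, beta = fit_cclm([100, 120, 140, 160], [60, 70, 80, 90])
    print(alpha, beta)
    print(predict_chroma([[100, 200], [0, 250]], alpha, beta))
```

Because a single (α, β) pair is shared by the whole block, a model of this form can only describe a straight-line relationship between luma and chroma, which is the limitation the proposed attention-based network is meant to address.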
Authors
王昂
何小海
罗丹
熊淑华
陈洪刚
WANG Ang; HE Xiaohai; LUO Dan; XIONG Shuhua; CHEN Honggang (College of Electronics and Information Engineering, Sichuan University, Chengdu 610065, China)
Source
《电讯技术》
Peking University Core Journal
2024, No. 11, pp. 1741-1749 (9 pages)
Telecommunication Engineering
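Funding
National Natural Science Foundation of China (62271336, 62211530110)
Sichuan Province International Science and Technology Innovation Cooperation Project (2024YFHZ0289)
TCL Science and Technology Innovation Fund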
Keywords
versatile video coding
intra-frame prediction
attention mechanism
convolutional neural network