Abstract
In recent years, video intended for machine vision has been studied and applied ever more widely, which poses great challenges for storing and transmitting such video. Video coding standards such as Versatile Video Coding (VVC) achieve efficient full-resolution compression and reconstruction, but for machine vision tasks this kind of compression retains redundancy. This paper therefore proposes a video coding method for machine tasks that incorporates saliency detection into the VVC encoding process. The instance segmentation network Mask Region-based Convolutional Neural Network (Mask R-CNN) produces binary masks containing the objects; these masks determine whether a region is a region of interest and guide the quantization parameter offset of each coding tree unit (CTU) during VVC encoding. Experiments show that, compared with the VVC baseline, the proposed method saves some bit rate at similar detection accuracy.
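The mask-to-offset step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the rule for deciding whether a CTU is a region of interest or the offset values, so the function name `ctu_qp_offsets`, the coverage threshold, the offsets 0 and 6, and the 128x128 CTU size are all hypothetical choices.

```python
from typing import List

# All constants below are illustrative assumptions, not values from the paper.
CTU_SIZE = 128            # assumed CTU size in luma samples
ROI_THRESHOLD = 0.0       # assumed: any masked pixel makes the CTU a region of interest
QP_OFFSET_ROI = 0         # assumed offset for CTUs that contain objects
QP_OFFSET_BACKGROUND = 6  # assumed (coarser quantization) offset for background CTUs


def ctu_qp_offsets(mask: List[List[int]],
                   ctu_size: int = CTU_SIZE,
                   roi_threshold: float = ROI_THRESHOLD,
                   roi_offset: int = QP_OFFSET_ROI,
                   bg_offset: int = QP_OFFSET_BACKGROUND) -> List[List[int]]:
    """Map a binary object mask (rows of 0/1) to one QP offset per CTU."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    offsets = []
    for y0 in range(0, height, ctu_size):
        row = []
        for x0 in range(0, width, ctu_size):
            # CTUs on the right and bottom picture edges may be partial.
            y1 = min(y0 + ctu_size, height)
            x1 = min(x0 + ctu_size, width)
            covered = sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))
            fraction = covered / ((y1 - y0) * (x1 - x0))
            row.append(roi_offset if fraction > roi_threshold else bg_offset)
        offsets.append(row)
    return offsets


# Toy example: a 4x4 mask with 2x2 "CTUs"; only the top-left block holds an object pixel.
toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
toy_offsets = ctu_qp_offsets(toy_mask, ctu_size=2)
print(toy_offsets)
```

In a real pipeline the mask would come from Mask R-CNN inference on each frame, and the resulting offsets would be passed to the encoder's per-CTU QP control; both steps are omitted here.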
Authors
李鸿耀
何小海
陈洪刚
魏海涛
熊淑华
LI Hongyao; HE Xiaohai; CHEN Honggang; WEI Haitao; XIONG Shuhua (College of Electronic Information, Sichuan University, Chengdu, Sichuan 610065, China)
Source
Communications Technology (《通信技术》)
2024, No. 5, pp. 436-443 (8 pages)
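Funding
National Natural Science Foundation of China (62271336, 62211530110)
TCL Science and Technology Innovation Fund
Sichuan Provincial Science and Technology Program (24GJHZ0381)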