Abstract
Video-based person re-identification (Re-ID) matches a video tracklet against clipped video frames so that the same pedestrian can be recognized across different cameras. In real scenes, however, the collected pedestrian tracklets suffer from severe appearance loss and misalignment, so conventional 3D convolution is no longer suitable for video-based person re-identification. To address this problem, a 3D feature block reconstruction model (3D-FBRM) is proposed, which uses the first feature map to align the subsequent feature maps at the level of horizontal blocks. To fully exploit the temporal information of the tracklet while preserving feature quality, a 3D convolution kernel is added after the FBRM, and the model is combined with existing 3D ConvNets. In addition, a coarse-to-fine feature block reconstruction network (CF-FBRNet) is introduced, which not only enables the model to perform feature reconstruction at two different spatial scales but also further reduces computational overhead. Experiments show that CF-FBRNet achieves state-of-the-art results on the MARS and DukeMTMC-VideoReID datasets.
Authors
王锦华
周非
白梦林
舒浩峰
WANG Jinhua; ZHOU Fei; BAI Menglin; SHU Haofeng (School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)
Source
《数据采集与处理》
CSCD
PKU Core Journals (北大核心)
2023, Issue 3, pp. 565-573 (9 pages)
Journal of Data Acquisition and Processing
Keywords
video-based person re-identification
feature block
feature reconstruction
3D convolution
coarse-to-fine feature block reconstruction network (CF-FBRNet)