Abstract
Visual object tracking is an important yet challenging task in computer vision with a wide range of applications, such as video surveillance, robotics, action recognition, scene understanding, intelligent transportation, visual navigation, and human-machine interaction. It aims to estimate the state of an arbitrary object in video frames, given the object bounding box in an initial frame. In recent years, deep learning technology has driven rapid progress in the object tracking field, and numerous deep-learning-based visual tracking methods have achieved strong results, especially Siamese trackers, which aim to learn a decision-based similarity evaluation. Nevertheless, insufficient labeled data limits the efficient training of deep network models. Therefore, self-supervised learning has been applied to object tracking to solve the problem that model training requires a large amount of labeled data. However, existing self-supervised trackers mostly extract shallow information about the object and lack an efficient representation of its key features. In addition, they ignore the difficulty of reverse verification caused by challenges such as object occlusion, resulting in a decrease in tracking accuracy. To solve these problems, a multi-frame consistency correction based self-supervised Siamese network tracking method (MCCSST) is proposed in this paper, which consists of a forward multi-frame reverse-order verification strategy, a mixed-order correction module, and a visual feature enhancement module. First, the forward multi-frame reverse-order verification strategy adaptively selects the optimal tracking trajectory from multiple paths to construct a cycle-consistency loss, so that challenges such as object occlusion, background clutter, and deformation can be reasonably circumvented. Second, to address the inconsistent object localization produced by multiple paths in the same frame, a mixed-order correction module is proposed to correct tracking drift and enhance the robustness of object feature extraction; it exploits the temporal information of a video to better focus on the object's own features during forward tracking. In addition, the visual feature enhancement module, consisting of a channel correlation branch, a convolution block branch, and a spatial correlation branch, enhances the representation ability of object features by adaptively weighted fusion of the object's global context information and local semantic feature information. To strengthen the channel category and spatial position information of the object while suppressing irrelevant background information, we further develop an adaptive feature fusion scheme that fuses the multi-dimensional feature maps of the three branches. Based on the Siamese network architecture, a Discriminant Correlation Filters Network with Vital Feature Enhancement (DCFNet-VFE) is designed as our baseline, and the object location is then obtained through the filter layer. Finally, the proposed method is evaluated on four public object tracking benchmark datasets: OTB2013, OTB2015, TColor-128, and VOT-2018. The experimental results show that, in complex scenes (e.g., illumination variation, deformation, and background interference), the accuracy of the proposed method on the four benchmarks is improved by 4.6% on average over twenty-one compared state-of-the-art trackers, and is 5.8% higher on average than that of self-/unsupervised learning-based trackers.
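As a toy illustration (not the paper's implementation) of how a cycle-consistency loss with multiple candidate paths can be constructed, the sketch below tracks a synthetic 1-D "object" forward along several frame paths, re-tracks it in reverse order back to the first frame, and scores each path by how far the back-tracked position drifts from the starting position. The toy correlation tracker and all names here are hypothetical stand-ins for the Siamese similarity search.

```python
import math
import random

def localize(frame, template):
    """Slide the template over the 1-D frame and return the offset with the
    highest normalized correlation (a toy stand-in for the Siamese search)."""
    t_norm = math.sqrt(sum(v * v for v in template))
    best_score, best_pos = -1e9, 0
    for pos in range(len(frame) - len(template) + 1):
        patch = frame[pos:pos + len(template)]
        p_norm = math.sqrt(sum(v * v for v in patch))
        score = sum(a * b for a, b in zip(patch, template)) / (p_norm * t_norm + 1e-8)
        if score > best_score:
            best_score, best_pos = score, pos
    return best_pos

def cycle_loss(frames, start_pos, width, path):
    """Track forward along `path` (a list of frame indices), then verify in
    reverse order back to the first frame; the loss is the squared drift
    between the starting position and the back-tracked position."""
    pos = start_pos
    template = frames[path[0]][start_pos:start_pos + width]
    for idx in path[1:]:                      # forward tracking
        pos = localize(frames[idx], template)
        template = frames[idx][pos:pos + width]
    for idx in reversed(path[:-1]):           # reverse-order verification
        pos = localize(frames[idx], template)
        template = frames[idx][pos:pos + width]
    return (pos - start_pos) ** 2

# Toy video: a bright 4-sample "object" drifting right by 1 sample per frame.
random.seed(0)
frames = []
for t in range(4):
    frame = [random.gauss(0.0, 0.05) for _ in range(32)]
    for i in range(10 + t, 14 + t):
        frame[i] += 1.0
    frames.append(frame)

# Candidate trajectories (dense and frame-skipping); the verification
# strategy keeps the path whose cycle-consistency loss is smallest.
paths = [[0, 1, 2, 3], [0, 2, 3], [0, 1, 3]]
losses = [cycle_loss(frames, 10, 4, p) for p in paths]
best_path = paths[losses.index(min(losses))]
```

In MCCSST the per-frame localization is performed by the learned Siamese/correlation-filter network and the cycle-consistency loss is back-propagated to train it without labels; the hand-written correlation search above merely illustrates the forward-then-reverse verification loop and the selection among multiple paths.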
Authors
程旭
刘丽华
王莹莹
余梓彤
赵国英
CHENG Xu; LIU Li-Hua; WANG Ying-Ying; YU Zi-Tong; ZHAO Guo-Ying (School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044; Engineering Research Center of Digital Forensics, Nanjing University of Information Science and Technology, Nanjing 210044; Center for Machine Vision and Signal Analysis, University of Oulu, Oulu FI-90014, Finland)
Source
《计算机学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2022, Issue 12, pp. 2544-2560 (17 pages)
Chinese Journal of Computers
Funding
National Natural Science Foundation of China (61802058, 61911530397)
China Scholarship Council Program (201908320175)
China Postdoctoral Science Foundation (2019M651650)
Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX22_1220)
Keywords
video surveillance
object tracking
self-supervised learning
cycle-consistency loss
visual attention mechanism