Visual intelligence evaluation techniques for single object tracking: a survey


Abstract: The single object tracking (SOT) task, which aims to model the human dynamic vision system and achieve human-like object tracking ability in complex environments, has been widely used in real-world applications such as self-driving, video surveillance, and robot vision. Over the past decade, developments in deep learning have encouraged many research groups to design tracking frameworks such as correlation filters (CF) and Siamese neural networks (SNNs), which have advanced SOT research. However, many factors in natural application scenes (e.g., target deformation, fast motion, and illumination changes) still challenge SOT trackers. Algorithms with novel architectures have therefore been proposed for robust tracking, and they perform well in representative experimental environments. Nevertheless, failure cases in natural application environments reveal a large gap between the performance of state-of-the-art trackers and human expectations, which indicates that current evaluation techniques lag behind and motivates us to pay close attention to evaluation. Therefore, instead of following traditional reviews that concentrate mainly on algorithm design, this study systematically reviews the visual intelligence evaluation techniques for SOT, covering four key aspects: the task definition, evaluation environments, task executors, and evaluation mechanisms. First, we present the development of the task definition, which includes the original short-term tracking, long-term tracking, and the recently proposed global instance tracking. As the SOT definition has evolved, research has progressed from perceptual to cognitive intelligence. We also summarize the challenging factors in the SOT task to help readers understand the research bottlenecks in actual applications. Second, we compare the representative experimental environments used in SOT evaluation. Unlike existing reviews that mainly introduce datasets in chronological order, this study divides the environments into three categories (general datasets, dedicated datasets, and competition datasets) and introduces them separately. Third, we introduce the executors of SOT tasks, which include not only tracking algorithms, represented by traditional trackers, CF-based trackers, SNN-based trackers, and Transformer-based trackers, but also human visual tracking experiments conducted in interdisciplinary fields. To our knowledge, none of the existing SOT reviews has included related work on human dynamic visual ability. Introducing interdisciplinary work therefore supports visual intelligence evaluation by comparing machines with humans and better reveals how intelligent existing algorithmic modeling methods are. Fourth, we review the evaluation mechanisms and metrics, which encompass traditional machine-machine comparisons and novel human-machine comparisons, and analyze the target tracking capability of the various task executors. We also provide an overview of the human-machine comparison known as the visual Turing test, including its application in many vision tasks (e.g., image comprehension, game navigation, image classification, and image recognition). In particular, we hope that this study helps researchers focus on this novel evaluation technique, better understand capability bottlenecks, further explore the gaps between existing methods and humans, and finally achieve the goal of algorithmic intelligence. Finally, we indicate the evolution trend of visual intelligence evaluation techniques: 1) designing more human-like task definitions, 2) constructing more comprehensive and realistic evaluation environments, 3) including human subjects as task executors, and 4) using human abilities as a baseline to evaluate machine intelligence. In conclusion, this study summarizes the evolution trend of visual intelligence evaluation techniques for the SOT task, further analyzes the existing challenge factors, and discusses possible future research directions.

Authors: Hu Shiyu; Zhao Xin; Huang Kaiqi (School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; Center for Research on Intelligent System and Engineering, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China)

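Affiliations: School of Artificial Intelligence, University of Chinese Academy of Sciences; Center for Research on Intelligent System and Engineering, Institute of Automation, Chinese Academy of Sciences; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences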

Source: Journal of Image and Graphics (中国图象图形学报; indexed in CSCD and the Peking University Core Journals list), 2024, No. 8, pp. 2269-2302 (34 pages)

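Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0116403); National Natural Science Foundation of China (61721004); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA27000000).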

Keywords: intelligence evaluation technique; competitions and datasets; visual tracking ability; single object tracking (SOT); object tracking algorithms

Classification code: TP389.1 [Automation and Computer Technology: Computer System Architecture]


