Scale changing in general object detection: a survey (目标检测中的尺度变换应用综述)

Cited by: 5


Abstract: Object detection attempts to mark the object instances that appear in natural images with given labels, and it is widely used in fields such as autonomous driving and security surveillance. With the spread of deep learning, general object detection frameworks based on convolutional neural networks have achieved far better detection results than other methods. However, owing to the inherent limitations of convolutional neural networks, general object detection still faces many challenges, including scale, illumination, and occlusion. This paper aims to provide a comprehensive survey of scale-oriented object detection strategies in convolutional neural network architectures. First, it introduces the development of general object detection and the main datasets used, including the two categories of general object detection frameworks and their development; it details the evolution and structural innovations of two-stage detection algorithms based on region proposals, as well as three different schools of detection algorithms based on single-pass regression. Second, it gives a simple classification of the optimization ideas for the scale problem that affects detection performance, including multi-feature fusion strategies, convolution deformations for receptive fields, and training strategy designs. Finally, it reports the detection accuracy of different detection frameworks for targets of different sizes on common datasets, together with possible future directions for scale transformation.

General object detection has been one of the most important research topics in the field of computer vision. The task attempts to locate and mark each object instance that appears in a natural image using a series of given labels. The technique has been widely used in real application scenarios, such as automatic driving and security monitoring. With the development and popularization of deep learning technology, acquiring the semantic information of images has become easier; thus, the general object detection framework based on convolutional neural networks (CNNs) has obtained better results than other target detection methods. Because the large-scale datasets for this task are better developed than those designed for other vision tasks and the metrics are well defined, the task has evolved rapidly among CNN-based computer vision tasks. However, general object detection still faces many problems, such as scale changes, illumination changes, and occlusions, owing to the limitations of the CNN structure. Because the features extracted by CNNs are sensitive to scale, multiscale detection is valuable but challenging in CNN-based target detection. Research on scale transformation also has reference value for other small-target or pixel-level tasks, such as semantic segmentation and pose detection. This study mainly aims to provide a comprehensive overview of object detection strategies for scale in CNN architectures, that is, how to locate and classify targets of different sizes robustly.

First, we introduce the development of general target detection problems and the main datasets used. Then, we introduce the two categories of general object detection frameworks. The first category, two-stage strategies, first obtains region proposals and then selects among them by classification confidence; it mostly takes region-based convolutional neural networks (RCNN) as the baseline. As the RCNN structure developed, all of its stages were turned into specific convolution layers, forming an end-to-end structure. In addition, several tricks were designed for the baseline to solve specific problems, improving its robustness for all kinds of object regions. The second category, one-stage strategies, obtains the region location and category with a single regression; it starts with a structure named "you only look once", which regresses the object information for every block into which the image is divided. The baseline then became fully convolutional and end to end, and it uses deep and effective features. This baseline became popular after focal loss was proposed, because focal loss addresses the imbalance between positive and negative samples that regression may cause. Other methods, which detect objects via point location and borrow from pose estimation tasks, also obtain satisfactory results in general target detection.

We then introduce a simple classification of the optimization ideas for scale problems; these ideas include multi-feature fusion strategies, convolution deformations for receptive fields, and training strategy designs. Multi-feature fusion strategies are used for object classes that do not always appear at a small scale. Multi-feature fusion obtains semantic information from different image scales and fuses it to reach the most suitable scale, and it can effectively identify objects of one class that appear at different sizes. Widely used structures fall into two groups: those that use single-shot detection and those with feature pyramid networks. Some structures have a skip-layer fusion design. Through its receptive field, every feature corresponds to a region of the image or of a lower-level feature map, and designs that target the receptive field can handle targets that always appear small in the image. The receptive field of an ordinary convolution equals the size of its kernel, so special convolution kernels have been designed. Dilated kernels are a prominent example of such deformed kernels; they are used with a specially designed pooling layer to obtain dense high-level features. Some scholars have designed an offset layer that automatically learns the most useful deformation of the convolution kernel. A training strategy can also be designed for small targets. A dataset that includes only small objects can be built, and images of different sizes can be fed to the network in sequence during training. Resampling images is also a common strategy.

We provide detection accuracy results for targets of different sizes on common datasets for different detection frameworks. The results are obtained on the Microsoft common objects in context (MS COCO) dataset. We use average precision (AP) to measure detection quality, and the result set includes results for small, medium, and large targets and for different intersection-over-union thresholds. These results show the influence of scale changes. This study also provides a set of possible future development directions for scale transformation, including strategies for obtaining robust features and detection modules and for designing a training dataset.
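The abstract's remark that an ordinary convolution's receptive field equals its kernel size, while dilated kernels enlarge it, can be illustrated with a small calculation. The following is a minimal sketch, not code from the surveyed paper: the function names and the `(kernel_size, stride, dilation)` layer encoding are assumptions made for illustration, and the formula used is the standard receptive-field recurrence for stacked convolution layers.

```python
def effective_kernel_size(kernel_size: int, dilation: int = 1) -> int:
    """Extent (along one axis) covered by a kernel with the given dilation."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers) -> int:
    """Receptive field, in input pixels, of one output unit of a layer stack.

    `layers` is a sequence of (kernel_size, stride, dilation) tuples ordered
    from input to output (a hypothetical encoding chosen for this sketch).
    """
    rf = 1    # receptive field accumulated so far
    jump = 1  # spacing, in input pixels, between adjacent units of the current layer
    for kernel_size, stride, dilation in layers:
        rf += (effective_kernel_size(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Three plain 3x3 convolutions versus three 3x3 convolutions with dilations 1, 2, 4.
plain = receptive_field([(3, 1, 1), (3, 1, 1), (3, 1, 1)])
dilated = receptive_field([(3, 1, 1), (3, 1, 2), (3, 1, 4)])
print(plain, dilated)  # 7 15
```

With the same number of weights, the dilated stack covers a 15-pixel extent instead of 7, which is why dilation is attractive when dense high-level features are needed without downsampling.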
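The evaluation described in the abstract reports AP at different intersection-over-union thresholds and separately for small, medium, and large targets. As a rough sketch of the two quantities involved, the code below computes IoU for axis-aligned boxes and assigns a size bucket using the commonly cited MS COCO area limits of 32² and 96² pixels. The function names, the `(x1, y1, x2, y2)` box format, and the handling of values exactly on a boundary are assumptions for illustration; the official COCO evaluation measures ground-truth area from segmentation masks rather than boxes.

```python
def box_area(box) -> float:
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou(box_a, box_b) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter = box_area((
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    ))
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0


def coco_size_bucket(area: float) -> str:
    """Size bucket for an object area in pixels (boundary handling is assumed)."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


# A detection counts as a match at threshold t only if its IoU with a ground-truth box is at least t.
overlap = iou((0, 0, 10, 10), (5, 5, 15, 15))
print(overlap >= 0.5, coco_size_bucket(box_area((0, 0, 10, 10))))  # False small
```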

Authors: Shen Fengcan; Zhang Ping; Luo Jin; Liu Songyang; Feng Shijie (School of Optoelectronic Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China)

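Affiliation: School of Optoelectronic Science and Engineering, University of Electronic Science and Technology of China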

Source: Journal of Image and Graphics (中国图象图形学报), CSCD, Peking University Core Journal, 2020, No. 9, pp. 1754-1772 (19 pages)

Funding: Sichuan Science and Technology Program (2018GZ0166, 2019YFG0307).

Keywords: image semantic understanding; general object detection; convolutional neural network (CNN); scale changing; small target detection

Classification number: TP751.1 [Automation and Computer Technology: Detection Technology and Automatic Devices]

