An overview of deep learning based pedestrian detection algorithms (深度学习行人检测方法综述)

Cited by: 17


Abstract: Pedestrian detection has shown great practical value in intelligent transportation systems, intelligent security surveillance, and intelligent robotics, and has become one of the important research directions in computer vision. Thanks to the rapid development of deep learning, general object detection models based on deep convolutional neural networks have been steadily extended to pedestrian detection and have achieved good performance. However, because of the inherent particularity and complexity of pedestrian targets, and especially because of occlusion and scale variation in complex scenes, deep learning based pedestrian detection still faces serious challenges in accuracy and efficiency. Addressing these problems, this paper takes deep learning based pedestrian detection as its subject. Building on an extensive literature survey, it classifies pedestrian detection algorithms in detail from three perspectives: anchor-based methods, anchor-free methods, and general technique improvements (e.g., improved loss functions and non-maximum suppression), and selects representative methods for detailed description and comparative analysis. The paper summarizes the commonly used pedestrian detection datasets and analyzes the application scenarios of each from the perspective of data composition. It also discusses how each class of algorithm performs on different datasets and compares their strengths and weaknesses. Finally, it looks ahead to open problems and future research directions. How to mitigate the feature loss caused by occlusion, how to handle scale variation under a single viewpoint, how to improve detector efficiency, and how to exploit multimodal information effectively to improve detection accuracy are all directions worth further study.

Extended abstract: Computer vision has developed rapidly and underpins tasks such as image classification and face recognition. Traditional machine learning methods have long served as the basic technology for computer vision tasks. Their core is to determine the location and category of a target using image features designed by hand for the task, but this manual design process is costly. Deep learning, by contrast, can learn effective features automatically from labeled or unlabeled data in a supervised or unsupervised manner, which benefits image recognition and object detection. Deep learning based pedestrian detection is one product of this development. Pedestrian detection aims to identify pedestrian targets in an input single-frame image or image sequence and to determine their locations in the image.

Because of complicated scenes and the particular nature of pedestrian targets, deep learning based pedestrian detection faces two key challenges.
1) Occlusion. Under severe occlusion, the structural information of the human body is badly degraded, so the visual features of occluded pedestrians differ from those of unoccluded ones, which leads to false negatives during inference. Because occlusion patterns are diverse, it is hard to determine accurately which part is occluded, and this tests the localization ability of pedestrian detection algorithms.
2) Scale variation. Pedestrians appear at widely varying scales in both crowded and sparse scenes. A tiny target lacks sufficient semantic information, so the detector is likely to mistake it for background noise. For a large-scale target, in turn, it is hard to find a set of anchors that matches it well during training. Moreover, large-scale pedestrian instances often have clear internal texture and skeleton features, whereas small-scale ones often have only blurred edge information. A unified framework that handles both large and small targets is therefore required.

This paper reviews deep learning based pedestrian detection algorithms and analyzes recent improvements to mainstream detection frameworks from three aspects: anchor-based algorithms, anchor-free algorithms, and general technique improvements (e.g., loss functions and non-maximum suppression).

Among anchor-based methods, the review focuses on pedestrian detectors built on a Faster region-based convolutional neural network (R-CNN) or single shot multibox detector (SSD) baseline, in which region proposals are first generated and then refined to obtain the final detections. Whether the underlying anchor-based detector is single-stage or two-stage, these algorithms add modules customized for pedestrians. They fall into the following categories.
1) Part-based methods: local part features carry more information about pedestrian occlusion and deformation, so methods such as occlusion-aware R-CNN (OR-CNN) extract part-level features to improve the detection of occluded pedestrians. Instead of using extra part detectors or delineating part regions manually, several methods such as the mask-guided attention network (MGAN) use an attention mechanism to enhance the features of visible pedestrian regions while suppressing those of occluded ones.
2) Hybrid methods: methods such as Bi-box and PedHunter build two-branch networks for part and full-body prediction and introduce a fusion mechanism so that both local and global pedestrian features are used more robustly.
3) Cascaded methods: cascade structures have also been applied to pedestrian detection to improve localization quality. Cascade R-CNN, the autoregressive network (AR-Ped), and the asymptotic localization fitting network (ALFNet) stack multiple prediction heads for multi-stage regression of the proposals, so that the pedestrian boxes are refined step by step toward better localization.
4) Multi-scale methods: these methods build a robust feature representation by fusing high-level and low-level features, as in the feature pyramid network (FPN), to cope with scale variation in pedestrian detection.

Among anchor-free methods, the review covers two kinds of detector: the point-based center scale predictor (CSP) and the line-based topological line localization (TLL). Neither uses predefined anchor boxes, so both fall under the anchor-free paradigm. These methods avoid the redundant background information introduced by predefined boxes and therefore perform relatively better on small-scale and occluded pedestrians.

The review also summarizes improvements to general techniques that apply to both anchor-based and anchor-free detectors. Loss function modifications, represented by repulsion loss (RepLoss), pull a proposal toward its matched ground-truth box while pushing it away from other ground-truth boxes. Another key technique is non-maximum suppression (NMS), which is commonly used to remove duplicate detections. Representative methods are adaptive NMS and R2NMS, which generally seek a more suitable post-processing threshold so that the detector copes better with occlusion.

Commonly used datasets such as Caltech and CityPersons, together with their challenging subsets (e.g., reasonable and heavy), are introduced in detail. Using the log-average miss rate as the evaluation metric, the review compares performance on the different subsets, each targeting a particular challenge, and provides an experimental analysis.
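The abstract's remark that NMS removes duplicate detections, and that occlusion makes the choice of threshold delicate, can be illustrated with a minimal sketch of standard greedy NMS. This is not code from the paper: the `(x1, y1, x2, y2)` box format, the function names `iou` and `greedy_nms`, and the default threshold of 0.5 are all assumptions made for illustration, and adaptive NMS and R2NMS are not implemented here.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return the indices of the kept boxes, highest score first."""
    # Visit detections from most to least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if no already-kept box overlaps it beyond the threshold.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Hypothetical detections: box 1 duplicates box 0, box 2 is a
    # partially occluded neighbour of box 0, box 3 stands apart.
    boxes = [(0, 0, 10, 20), (1, 0, 11, 20), (5, 0, 15, 20), (30, 0, 40, 20)]
    scores = [0.9, 0.8, 0.7, 0.6]
    print(greedy_nms(boxes, scores, iou_threshold=0.5))
    print(greedy_nms(boxes, scores, iou_threshold=0.3))
```

In this toy input the neighbouring pedestrian (index 2) overlaps the top-scoring box with an IoU of about 0.33, so it survives a threshold of 0.5 but is suppressed at 0.3. That trade-off between removing duplicates and keeping occluded neighbours is the motivation the abstract gives for threshold-adjusting variants.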

Authors: Luo Yan (罗艳); Zhang Chongyang (张重阳); Tian Yonghong (田永鸿); Guo Jie (郭捷); Sun Jun (孙军)

Affiliations: School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China; School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Source: Journal of Image and Graphics (《中国图象图形学报》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 7, pp. 2094-2111 (18 pages)

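Funding: National Key Research and Development Program of China (2017YFB1002400); National Natural Science Foundation of China (61971281); Shanghai Key Laboratory project (18DZ2270700).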

Keywords: pedestrian detection; deep learning; convolutional neural network (CNN); occlusion target detection; small-scale target detection

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

