
Improved YOLO Object Detection Algorithm Based on Deformable Convolution (cited by: 15)
Abstract: The YOLO object detection algorithm suffers from inaccurate bounding-box localization and low detection accuracy on small objects. To address these problems, an improved YOLO algorithm based on deformable convolution, dcn-YOLO, is proposed. The algorithm uses k-means++ to cluster anchor boxes that better match the object sizes in the data set, which reduces the influence of the initial points on the clustering result and speeds up the convergence of network training. A residual deformable convolution module, res-dcn, is then constructed. Two variants of dcn-YOLO are derived, one by embedding res-dcn in the first YOLO feature extraction head module and the other by replacing the three YOLO feature extraction head modules with res-dcn, so that the network can adaptively learn the receptive field of each feature point and extract more effective features for objects of different sizes and shapes, improving detection accuracy. Experimental results on the VOC data sets show that the proposed algorithm effectively improves object detection accuracy: its mAP reaches 82.6%, which is 2.1, 5.2, and 9.4 percentage points higher than that of YOLO, SSD, and Faster R-CNN, respectively.
Authors: HUANG Fengqi (黄凤琪); CHEN Ming (陈明); FENG Guofu (冯国富) (Institute of Information Technology, Shanghai Ocean University, Shanghai 201306, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2021, No. 10, pp. 269-275, 282 (8 pages)
Funding: National Key Research and Development Program of China (2018YFD0701003); Shanghai Science and Technology Innovation Action Plan (6391902902).
Keywords: YOLO algorithm; object detection; receptive field; deformable convolution; k-means++ algorithm
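The anchor-clustering step named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract only says that k-means++ is used to cluster anchor boxes, so the `1 - IoU` distance (a common choice for YOLO anchor clustering), the synthetic `boxes` data, the choice of `k=3`, and the function names `iou_wh`, `kmeanspp_init`, and `cluster_anchors` are all assumptions made for the example.

```python
import random


def iou_wh(box, centroid):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeanspp_init(boxes, k, rng):
    """k-means++ seeding: each new seed is drawn with probability
    proportional to the squared distance to its nearest existing seed.
    Distance is assumed to be 1 - IoU."""
    centroids = [rng.choice(boxes)]
    while len(centroids) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centroids) ** 2 for b in boxes]
        if sum(d2) == 0:
            # Every box coincides with a seed; fall back to a uniform draw.
            centroids.append(rng.choice(boxes))
        else:
            centroids.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centroids


def cluster_anchors(boxes, k, iters=50, seed=0):
    """Lloyd iterations after k-means++ seeding; returns k (w, h) anchors
    sorted by area."""
    rng = random.Random(seed)
    centroids = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            # Assign each box to the centroid with the highest IoU.
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            groups[best].append(b)
        updated = []
        for group, old in zip(groups, centroids):
            if group:
                mean_w = sum(w for w, _ in group) / len(group)
                mean_h = sum(h for _, h in group) / len(group)
                updated.append((mean_w, mean_h))
            else:
                updated.append(old)  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Synthetic (width, height) samples in pixels, for illustration only.
_rng = random.Random(42)
boxes = [
    (w * _rng.uniform(0.8, 1.2), h * _rng.uniform(0.8, 1.2))
    for w, h in [(12, 16), (60, 45), (200, 150)]
    for _ in range(30)
]
anchors = cluster_anchors(boxes, k=3)
print(anchors)
```

Compared with uniformly random seeding, the distance-weighted draw in `kmeanspp_init` tends to spread the initial centroids across small, medium, and large boxes, which is the reduced sensitivity to initial points that the abstract refers to.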
