Abstract
With the rapid development of deep convolutional neural networks, deep-learning-based methods have become the mainstream in object detection owing to their strong feature representation and high detection accuracy. To reduce missed detections of small objects, multi-scale processing is commonly used. Existing multi-scale object detection methods can be divided into image-pyramid-based and feature-pyramid-based methods. Compared with image-pyramid-based methods, feature-pyramid-based methods are faster and make fuller use of the feature information from different convolutional layers. However, existing feature-pyramid-based methods fuse feature maps of different scales by element-wise addition, which tends to lose low-level detail information during fusion. To address this problem, this paper builds on the feature pyramid network (FPN) and proposes a multi-feature concatenation network (MFCN) together with an object detection method based on it. Taking FPN as its basis, the network uses a multi-layer feature map concatenation structure that fuses high-level semantic features with low-level detail features by concatenating feature maps across feature layers. Detection is performed on every layer, so that each layer contains the feature information of that layer and of all layers above it, effectively overcoming the loss of low-level detail. In addition, to make full use of the high-level features in ResNet101, new convolutional layers are appended after it and combined with its low-level feature maps to extract multi-scale features. MFCN achieves a detection accuracy of 80.1% mAP on the PASCAL VOC 2007 dataset and outperforms FPN on the PASCAL VOC 2012 and MS COCO datasets.
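The difference between the two fusion styles named in the abstract (element-wise addition versus feature map concatenation) can be illustrated with a toy sketch. This is not the authors' implementation: it assumes nearest-neighbour 2x upsampling, stores a feature map as a plain list of channels, omits all convolutions, and uses hypothetical helper names (`upsample2x`, `fuse_add`, `fuse_concat`, `concat_pyramid`) chosen only for illustration.

```python
# Toy sketch only: a feature map is a list of channels,
# and each channel is a 2-D list of numbers (H x W).

def shape(fmap):
    """Return (channels, height, width) of a feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling applied to every channel (an assumption)."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out

def fuse_add(low, high):
    """Addition-style fusion: sum matching elements; channel counts must match."""
    up = upsample2x(high)
    assert shape(low) == shape(up), "addition needs identical shapes"
    return [[[a + b for a, b in zip(r1, r2)]
             for r1, r2 in zip(c1, c2)]
            for c1, c2 in zip(low, up)]

def fuse_concat(low, high):
    """Concatenation-style fusion: stack channels, so low-level values are kept as-is."""
    up = upsample2x(high)
    assert shape(low)[1:] == shape(up)[1:], "spatial sizes must match"
    return low + up

def concat_pyramid(levels):
    """levels: lowest (largest) level first. Each output holds its own level
    plus every level above it, mirroring the property stated in the abstract."""
    fused = levels[-1]
    outputs = [fused]
    for level in reversed(levels[:-1]):
        fused = fuse_concat(level, fused)
        outputs.insert(0, fused)
    return outputs

low = [[[1, 2], [3, 4]]]   # 1 channel, 2 x 2
high = [[[10]]]            # 1 channel, 1 x 1
print(fuse_add(low, high))
print(fuse_concat(low, high))
```

In the addition case the low-level values are blended into the sums, while in the concatenation case they survive unchanged as separate channels, at the cost of a channel count that grows toward the lower levels. A real detector would normally follow each fusion with convolutions to control that growth; how MFCN does so is not stated in the abstract.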
Authors
Yang Aiping (杨爱萍)
Lu Liyu (鲁立宇)
Ji Zhong (冀中)
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
Source
Journal of Tianjin University: Science and Technology (《天津大学学报(自然科学与工程技术版)》)
Indexed in: EI, CSCD, Peking University Core Journals (北大核心)
2020, Issue 6, pp. 647-652 (6 pages)
Funding
National Natural Science Foundation of China (Grant Nos. 61771329, 61632018).
Keywords
feature pyramid network
object detection
feature concatenation
semantic information