Lightweight Object Detection Combined with Multi-Scale Dilated Convolution and Multi-Scale Deconvolution (cited by: 4)
Abstract: Deep neural networks for object detection tend to be slow and parameter-heavy, which makes them ill-suited to mobile scenarios where computing resources are limited but high speed is required. To raise inference speed and balance detection accuracy against speed, this paper proposes MDDNet, a lightweight object detection network that combines multi-scale dilated convolution with multi-scale deconvolution. First, a lightweight base network is designed around an efficient single-stage detection strategy, and depthwise separable convolution is introduced to further reduce its parameter count and speed up feature extraction. Second, two feature extension branches based on multi-scale dilated convolution are added to the backbone, attached to the outputs of the final and penultimate residual layers of the base network; the features from both branches are passed to the prediction layer and fused there to strengthen the texture features of the shallower feature maps. Third, a multi-scale deconvolution module is attached to the deep feature layers to enlarge their feature maps, which are then fused with shallower feature maps of different scales from the preceding layers to capture richer semantic and detail information and improve detection accuracy. Finally, the prior bounding box parameters in the prediction layer are optimized with K-means clustering so that the prior boxes match the ground-truth boxes more closely, improving recognition accuracy. Experiments show that MDDNet has about 7.21×10^6 parameters, reaches an average detection accuracy of 58.7% on KITTI and 76.0% on Pascal VOC, and runs at 55 f/s and 52 f/s on the two datasets, respectively. MDDNet thus achieves a good balance among parameter count, detection speed, and detection accuracy, and is suitable for real-time object detection on mobile devices.
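The prior-box step in the abstract can be illustrated with a small sketch. The abstract only states that K-means is used, so the details below are assumptions: boxes are reduced to (width, height) pairs, similarity is measured by IoU as is common for anchor clustering, and the initial centroids are picked at evenly spaced positions after sorting by area. The function names and the toy boxes are hypothetical and do not come from the paper.

```python
def iou_wh(box, centroid):
    """IoU of two (width, height) boxes aligned at the same corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100):
    """Cluster (w, h) pairs into k prior boxes, assigning by highest IoU."""
    # Assumed deterministic init: evenly spaced picks from boxes sorted by area.
    by_area = sorted(boxes, key=lambda b: b[0] * b[1])
    n = len(by_area)
    centroids = [by_area[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centroids[i]))
            clusters[best].append(b)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(m[0] for m in members) / len(members)
                h = sum(m[1] for m in members) / len(members)
                new_centroids.append((w, h))
            else:
                # Keep the old centroid if no box was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return sorted(centroids, key=lambda c: c[0] * c[1])


def mean_best_iou(boxes, anchors):
    """Average, over all boxes, of the IoU with the best-matching anchor."""
    return sum(max(iou_wh(b, a) for a in anchors) for b in boxes) / len(boxes)


# Toy ground-truth box sizes in pixels (made up for illustration).
boxes = [(10, 12), (12, 10), (11, 11),
         (50, 100), (55, 95), (52, 98),
         (200, 80), (210, 90), (190, 85)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors, round(mean_best_iou(boxes, anchors), 3))
```

The mean best IoU gives a quick check of how well a set of prior boxes covers the ground-truth sizes; a higher value means the priors start closer to the targets the detector has to regress.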
Authors: YI Qingming; LÜ Renyi; SHI Min; LUO Aiwen (College of Information Science and Technology, Jinan University, Guangzhou 510632, Guangdong, China; Techtotop Microelectronics Technology Co., Ltd., Guangzhou 510663, Guangdong, China)
Source: Journal of South China University of Technology (Natural Science Edition), indexed in EI, CAS, CSCD, and the Peking University Core list; 2022, No. 12, pp. 41-48 (8 pages)
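Funding: National Natural Science Foundation of China (62002134); Guangdong Basic and Applied Basic Research Foundation (2020A1515110645); Guangdong Provincial Key Laboratory of Novel Semiconductor Materials and Devices project (2021KSY001); Guangzhou Innovation Leading Talent Program (2019019); Fundamental Research Funds for the Central Universities, Jinan University (21620353); Jinan University-Techtotop Joint Postgraduate Training Base project (82621176).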
Keywords: object detection; dilated convolution; deconvolution; multi-scale; accuracy-speed tradeoff