Abstract
Lightweight convolutional neural networks have enabled deep-learning-based semantic segmentation on low-power mobile devices. However, these networks generally ignore the relationship between the features being fused and combine them linearly, which limits segmentation accuracy. To address this problem, a lightweight convolutional attention feature fusion network based on an encoder-decoder architecture is proposed. In the encoder, a dilated MobileNet block built on MobileNetv2 provides a sufficiently large receptive field and strengthens the representation ability of the lightweight backbone. In the decoder, a convolutional attention feature fusion module learns the relationships among the channel, height, and width dimensions of the feature maps to obtain relative weights between different feature maps, and then fuses the feature maps using those weights, which improves the fusion result. The proposed network has only 0.68×10^6 (0.68 million) parameters. Without a pretrained model, post-processing, or extra data, experiments on the urban road scene datasets Cityscapes and CamVid using a single NVIDIA 2080Ti GPU show a mean intersection over union of 72.7% at 86 frames per second on Cityscapes and 67.9% at 105 frames per second on CamVid, a favorable trade-off among segmentation accuracy, model size, and speed.
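The contrast the abstract draws between linear fusion and fusion with relative attention weights can be illustrated with a minimal NumPy sketch. This is not the authors' module: the function names, the softmax normalization across the two inputs, and the use of the raw features as placeholder scores are all assumptions made for illustration. In the paper the weights are learned by a convolutional attention branch whose details are not given in this record.

```python
import numpy as np

def relative_weights(score_a, score_b):
    """Softmax across the two inputs, taken independently at every
    (channel, height, width) position, so the two weights sum to 1."""
    stacked = np.stack([score_a, score_b], axis=0)          # shape (2, C, H, W)
    stacked = stacked - stacked.max(axis=0, keepdims=True)  # numerical stability
    exp = np.exp(stacked)
    return exp / exp.sum(axis=0, keepdims=True)

def attention_fuse(feat_a, feat_b, score_a, score_b):
    """Weighted fusion of two feature maps with per-position relative weights."""
    w = relative_weights(score_a, score_b)
    return w[0] * feat_a + w[1] * feat_b

rng = np.random.default_rng(0)
feat_a = rng.standard_normal((4, 8, 8))  # hypothetical (C, H, W) feature map
feat_b = rng.standard_normal((4, 8, 8))

# Assumption: a trained network would produce these scores with a learned
# branch; here the features themselves stand in as placeholder scores.
score_a, score_b = feat_a, feat_b

fused = attention_fuse(feat_a, feat_b, score_a, score_b)
linear = feat_a + feat_b  # plain linear fusion, for comparison
```

When both scores are equal at a position, the weights reduce to 0.5 each and the result is a plain average, so linear fusion appears as the special case in which no relationship between the inputs is learned.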
Authors
董荣胜
刘意
马雨琪
李凤英
Dong Rongsheng; Liu Yi; Ma Yuqi; Li Fengying (Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004)
Source
《计算机辅助设计与图形学学报》
EI
CSCD
PKU Core Journals (北大核心)
2023, No. 6, pp. 935-943 (9 pages)
Journal of Computer-Aided Design & Computer Graphics
Funding
National Natural Science Foundation of China (62062029, 61762024).
Keywords
real-time semantic segmentation
lightweight convolutional neural network
attention mechanism
feature fusion