Abstract
Urban street-scene datasets contain many small targets and long, strip-shaped objects, which makes segmentation difficult. Encoder-decoder networks can refine segmentation results, but most do not make full use of spatial and contextual information. This paper therefore proposes a semantic segmentation algorithm based on pixel attention feature fusion. First, with ResNet50 as the backbone network, atrous spatial pyramid pooling (ASPP) and strip pooling perform an initial feature fusion that captures multi-scale features while suppressing useless information. A pixel fusion attention module then aggregates contextual information and recovers spatial information, and finally an attention feature refinement module removes redundant information. In experiments on the CamVid dataset, the algorithm reaches an mIoU of 75.22% on the validation set and 67.21% on the test set, improvements of 2.51% and 2.86%, respectively, over the DeepLabv3+ network.
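The results above are reported as mIoU (mean intersection over union). As a reminder of how that metric is usually computed, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou`, the `ignore_index` parameter, and the toy labels are all illustrative assumptions, and the exact CamVid evaluation protocol used in the paper (for example, which classes are ignored) is not stated here.

```python
def confusion_matrix(y_true, y_pred, num_classes, ignore_index=None):
    """Count (true class, predicted class) pairs over flattened pixel labels.

    Pixels whose ground-truth label equals ignore_index are skipped
    (an assumption: many benchmarks exclude a "void" class this way).
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp    # predicted c, true other
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both labels and predictions
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels (hypothetical data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(confusion_matrix(y_true, y_pred, num_classes=3))
print(f"mIoU = {score:.4f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Averaging per class rather than per pixel is what makes the metric sensitive to small and thin objects, the cases the abstract highlights as difficult.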
Authors
Li Lirong (李利荣)
Ding Jiang (丁江)
Mei Bing (梅冰)
Dai Junwei (戴俊伟)
Gong Pengcheng (巩朋成)
Affiliations: School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan 430068, China; Hubei Power Grid Intelligent Control and Equipment Engineering Technology Research Center, Hubei University of Technology, Wuhan 430068, China; School of Computer Science and Engineering, Wuhan Engineering University, Wuhan 430205, China
Source
Electronic Measurement Technology (《电子测量技术》)
Peking University Core Journal
2023, Issue 20, pp. 184-190 (7 pages)
Funding
Supported by the National Natural Science Foundation of China (62071172, 62202148)
Keywords
urban streetscape
pixel fusion
attention mechanism
strip pooling
semantic segmentation