Abstract
Semantic segmentation is a basic computer vision task that classifies an image pixel by pixel. Existing methods, however, tend to focus on the information in the current image and ignore the similarity of category distributions across different images, and hence the feature distribution of the dataset as a whole. To address this problem, this paper proposes a semantic segmentation network based on the accumulation of category features. During training, the network uses momentum accumulation to fit the feature distribution of the whole dataset from the category features of each individual image. In addition, to cope with the complex natural scenes found in general-purpose datasets, the network further subdivides the feature cluster of each category, with good results. Compared with OCRNet, which also exploits feature channel information, and with ResNet as the backbone, the model improves mean intersection over union (mIoU) by 0.34% and mean accuracy (mAcc) by 0.64% on the Pascal Context dataset. Experiments show that this advantage also holds on other datasets and with other backbones.
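The momentum accumulation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact update rule, so the code below assumes a plain exponential moving average over per-class mean features; the function names, the `momentum` value of 0.9, and the array shapes are hypothetical choices for illustration, not the authors' implementation. The further subdivision of each category's feature cluster is not covered here.

```python
import numpy as np


def class_mean_features(features, labels, num_classes):
    """Average the feature vectors of each class present in one image.

    features: (N, C) array of per-pixel features; labels: (N,) int array.
    Returns (means, present), where present[k] is True if class k occurs.
    """
    channels = features.shape[1]
    means = np.zeros((num_classes, channels), dtype=np.float64)
    present = np.zeros(num_classes, dtype=bool)
    for k in range(num_classes):
        mask = labels == k
        if mask.any():
            means[k] = features[mask].mean(axis=0)
            present[k] = True
    return means, present


def momentum_update(bank, initialized, means, present, momentum=0.9):
    """Update the dataset-level class features in place.

    Assumed rule (not taken from the paper): an exponential moving average,
    bank[k] <- momentum * bank[k] + (1 - momentum) * means[k],
    applied only to classes that appear in the current image.
    """
    for k in np.flatnonzero(present):
        if initialized[k]:
            bank[k] = momentum * bank[k] + (1.0 - momentum) * means[k]
        else:
            # First time this class is seen: start from its current mean.
            bank[k] = means[k]
            initialized[k] = True
    return bank, initialized
```

Calling `class_mean_features` on each training image and feeding the result to `momentum_update` keeps one running feature vector per class, so classes absent from an image leave their accumulated feature untouched.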
Authors
史正一
孙力
SHI Zhengyi; SUN Li (School of Communication and Electronic Engineering, East China Normal University, Shanghai 200241, China)
Source
《电子设计工程》
2023, No. 6, pp. 144-148 (5 pages)
Electronic Design Engineering
Keywords
computer vision
deep learning
semantic segmentation
attention algorithm
feature accumulation
moving average