Abstract
To address the low segmentation accuracy of existing few-shot semantic segmentation models on unseen novel classes, a few-shot semantic segmentation algorithm based on meta-learning is proposed. First, depthwise separable convolutions are used to improve the conventional backbone network, and the encoder is pre-trained on the ImageNet dataset. Second, the pre-trained backbone network maps the support and query images into a deep feature space. Finally, the ground-truth masks of the support images are used to separate the support features into object foreground and background, and an adaptive meta-learning classifier is constructed with a vision transformer. Extensive experiments were conducted on the PASCAL-5^i dataset. The results show that, with VGG-16, ResNet-50 and ResNet-101 backbone networks, the proposed model achieves mIoU (mean Intersection over Union) of 47.1%, 58.3% and 60.4%, respectively, in the 1-shot setting, and 49.6%, 60.2% and 62.1% in the 5-shot setting. On the COCO-20^i dataset, it achieves mIoU of 23.6%, 30.3% and 30.7% in the 1-shot setting, and 30.1%, 34.7% and 35.2% in the 5-shot setting.
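As a rough illustration of the mIoU metric reported in the abstract, the sketch below accumulates intersection and union pixel counts per class over binary masks and then averages the per-class IoU values. The function name `mean_iou`, the nested-list mask format and the per-class accumulation convention are assumptions made for this example; the abstract does not specify the paper's exact evaluation protocol.

```python
from collections import defaultdict


def mean_iou(samples):
    """Mean IoU over classes (hypothetical helper, not from the paper).

    samples: iterable of (class_id, pred_mask, true_mask), where each
    mask is a nested list of 0/1 values with identical shape.
    """
    inter = defaultdict(int)  # per-class intersection pixel count
    union = defaultdict(int)  # per-class union pixel count
    for class_id, pred, true in samples:
        for pred_row, true_row in zip(pred, true):
            for p, t in zip(pred_row, true_row):
                inter[class_id] += int(bool(p) and bool(t))
                union[class_id] += int(bool(p) or bool(t))
    # Skip classes whose union is empty to avoid dividing by zero.
    ious = [inter[c] / union[c] for c in union if union[c] > 0]
    if not ious:
        raise ValueError("no class has any foreground pixels")
    return sum(ious) / len(ious)


# Toy data: class "cat" has IoU 1/2, class "dog" has IoU 2/2.
toy = [
    ("cat", [[1, 1], [0, 0]], [[1, 0], [0, 0]]),
    ("dog", [[0, 1], [0, 1]], [[0, 1], [0, 1]]),
]
print(mean_iou(toy))  # 0.75
```

Accumulating counts per class before dividing weights every pixel of a class equally across images; averaging per-image IoU instead is another convention in use and would generally give different numbers.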
Authors
王兰忠
牟昌善
WANG Lanzhong; MU Changshan (School of Foreign Languages and Literature, Shandong University, Jinan, Shandong 250100, China; Information Center, Shandong Provincial Tax Service, State Taxation Administration, Jinan, Shandong 250002, China)
Source
《江苏大学学报(自然科学版)》
CAS
PKU Core Journals (北大核心)
2024, Issue 5, pp. 574-580, 620 (8 pages in total)
Journal of Jiangsu University: Natural Science Edition
Funding
Key Research and Development Program of Shandong Province (2021RKL02001).