Abstract
In few-shot image classification, feature extraction modules based on convolutional neural networks struggle to capture long-range semantic information, and edge-feature similarity is typically computed with a single metric. To address these two limitations, we propose a few-shot image classification algorithm that combines a graph neural network with a Swin Transformer. First, a Swin Transformer extracts image features, which serve as the node features of the graph neural network. Next, the edge-feature similarity module is improved by adding an extra metric, forming a dual-metric module that computes the similarity between node features; the resulting similarities serve as the edge features of the graph. Finally, node and edge features are updated alternately to predict image class labels. On the Stanford Dogs, Stanford Cars, and CUB-200-2011 datasets, the proposed method achieves 5-way 1-shot classification accuracies of 85.21%, 91.10%, and 91.08%, respectively, demonstrating strong performance on few-shot image classification tasks.
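The dual-metric edge module and the alternating node/edge updates described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not specify the exact pair of metrics or the update rules here, so the sketch assumes cosine similarity plus a Gaussian kernel on Euclidean distance as the two metrics, and a simple similarity-weighted aggregation as the node update.

```python
import numpy as np

def dual_metric_edges(nodes):
    """Combine two similarity metrics into one edge-feature matrix.

    nodes: (N, D) array of node features.
    Returns an (N, N) matrix averaging cosine similarity and a
    Gaussian kernel of squared Euclidean distance (both assumed metrics).
    """
    normed = nodes / np.linalg.norm(nodes, axis=1, keepdims=True)
    cos_sim = normed @ normed.T
    sq_dist = np.sum((nodes[:, None, :] - nodes[None, :, :]) ** 2, axis=-1)
    euc_sim = np.exp(-sq_dist / nodes.shape[1])
    return 0.5 * (cos_sim + euc_sim)

def update_nodes(nodes, edges):
    """Aggregate neighbor features weighted by row-normalized edge similarity."""
    weights = edges / edges.sum(axis=1, keepdims=True)
    return 0.5 * nodes + 0.5 * (weights @ nodes)

# Alternate edge and node updates for a few rounds, as the abstract describes.
nodes = np.random.default_rng(0).normal(size=(5, 8))  # 5 toy "images", 8-dim features
for _ in range(3):
    edges = dual_metric_edges(nodes)
    nodes = update_nodes(nodes, edges)
```

In the full model, the node features would come from the Swin Transformer backbone rather than random vectors, and the final edge matrix would be read out to assign query-image labels from their similarity to labeled support images.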
Authors
Wang Kai; Ren Jie; Zhang Weichuan (School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710048, Shaanxi, China; Institute for Integrated and Intelligent Systems, Griffith University, Brisbane 4702, Australia)
Source
Laser & Optoelectronics Progress (《激光与光电子学进展》), 2024, No. 12, pp. 371-379 (9 pages); indexed in CSCD and the Peking University Core Journal list
Funding
Natural Science Basic Research Program of Shaanxi Province (2022JM-394).