Transformer-Based Few-Shot and Fine-Grained Image Classification Method
Cited by: 2
Abstract: Few-shot fine-grained image classification suffers from two problems: reliance on a single similarity measure and weak fine-grained feature extraction. This paper proposes a Transformer-based few-shot fine-grained image classification method that addresses the poor classification performance caused by the small number of available samples. First, a new module, the CBG Transformer Block, is built from a multi-axis attention module and convolution operators; stacking this block repeatedly improves the network's feature extraction ability. Second, a dual similarity module consisting of a relation network and a cosine network is used for similarity measurement, which avoids the similarity bias that a single measure introduces when training data is scarce. Finally, the prediction is obtained by averaging the two similarity scores. Experimental results show that on the 5-way 5-shot task the proposed method reaches classification accuracies of 82.70%, 74.22%, and 69.68% on three public fine-grained image datasets, CUB-200-2011, Stanford Cars, and Stanford Dogs, respectively, indicating strong results on few-shot fine-grained image classification.
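The dual-similarity step described in the abstract (score a query against each class with two measures, average the two scores, pick the best class) can be illustrated with a small sketch. This is not the authors' implementation. It assumes embeddings are already extracted, assumes each class is represented by the mean of its support embeddings, and replaces the learned relation network with a hypothetical fixed function `toy_relation_score`. All function names here are illustrative.

```python
import math


def mean_vector(vectors):
    """Average a list of equal-length embeddings (assumed class representation)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine_score(a, b):
    """Cosine similarity between two embeddings; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def toy_relation_score(a, b):
    """Hypothetical stand-in for a learned relation network.

    Maps squared Euclidean distance into (0, 1]; a real relation network
    would be a trained model, which the abstract does not specify.
    """
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1.0 / (1.0 + sq_dist)


def dual_similarity_predict(query, support, relation_fn=toy_relation_score):
    """Average the relation and cosine scores per class and return the best class."""
    scores = {}
    for label, embeddings in support.items():
        proto = mean_vector(embeddings)
        scores[label] = 0.5 * (relation_fn(query, proto) + cosine_score(query, proto))
    best = max(scores, key=scores.get)
    return best, scores


# Toy 2-way 2-shot episode with 2-D embeddings (a 5-way 5-shot episode
# would have 5 classes with 5 support embeddings each).
support = {
    "class_a": [[1.0, 0.0], [0.9, 0.1]],
    "class_b": [[0.0, 1.0], [0.1, 0.9]],
}
label, scores = dual_similarity_predict([0.8, 0.2], support)
print(label, scores)
```

Passing a different callable as `relation_fn` swaps in another second measure without changing the averaging step, which is the part the abstract actually specifies.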
Authors: LU Yan (陆妍); WANG Yangping (王阳萍); WANG Wenrun (王文润) (School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; National Virtual Simulation Experimental Teaching Center for Rail Transit Information and Control, Lanzhou 730070, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2023, Issue 23, pp. 219-227 (9 pages)
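Funding: National Natural Science Foundation of China (62067006); Ministry of Education Humanities and Social Sciences Research Project (21YJC880085); Central Government Guided Local Science and Technology Development Special Fund Project (2020); Gansu Province Higher Education Industry Support Program (2020C-19); Gansu Province Intellectual Property Program (21ZSCQ013); Gansu Province Key Talent Project (2022); Gansu Province Science and Technology Program (21JR7RA713, 21YF5FA009); China University Industry-University-Research Innovation Fund, Beichuang Teaching Assistant Project (2021BCB02001).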
Keywords: fine-grained image classification; few-shot learning; multi-axis attention; conv-block-grid (CBG) Transformer Block; dual similarity