
Fine-grained visual classification based on vision transformer (基于ViT的细粒度图像分类)

Cited by: 1
Abstract: To address the large intra-class variance and high inter-class similarity inherent in the fine-grained visual classification (FGVC) task, a ViT-based classification method is proposed. ViT is adopted as the feature-encoding network to obtain a global representation of the image; a multi-region selection module (MRSM) is designed to capture subtle, discriminative, hierarchical information; and a simple yet effective center loss is used to shorten the distance in feature space between deep features and their corresponding class centers. The whole network is trained end-to-end under the supervision of image-level labels only. The method achieves classification accuracies of 90.1%, 90.2% and 93.7% on the CUB-200-2011, NABirds and Stanford Cars datasets, respectively, surpassing the current best algorithms.
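The center loss mentioned in the abstract pulls each deep feature toward its class center in feature space. Below is a minimal sketch of such a loss in PyTorch, combined with cross-entropy on image-level labels; the class name CenterLoss, the learnable-center parameterization, and the 0.005 weighting factor are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class CenterLoss(nn.Module):
    """Center-loss sketch: 0.5 * mean_i ||x_i - c_{y_i}||^2.

    Centers are stored as a learnable parameter and updated by the
    optimizer; the paper's exact center-update rule is not specified
    here, so this detail is an assumption.
    """

    def __init__(self, num_classes: int, feat_dim: int):
        super().__init__()
        # one trainable center per class in the feature space
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # features: (B, feat_dim) deep features; labels: (B,) class indices
        batch_centers = self.centers[labels]  # (B, feat_dim)
        return 0.5 * (features - batch_centers).pow(2).sum(dim=1).mean()


# Hypothetical usage: joint objective of cross-entropy and center loss.
ce_loss = nn.CrossEntropyLoss()
center_loss = CenterLoss(num_classes=200, feat_dim=768)  # e.g. CUB-200 classes, ViT-Base width
logits, feats = torch.randn(8, 200), torch.randn(8, 768)
labels = torch.randint(0, 200, (8,))
total = ce_loss(logits, labels) + 0.005 * center_loss(feats, labels)
total.backward()
```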
Authors: LI Jia-ying (李佳盈), JIANG Wen-ting (蒋文婷), YANG Lin (杨林), LUO Tie-jian (罗铁坚) (School of Automation, Beijing Information Science and Technology University, Beijing 100854, China; Institute of Telecommunication and Navigation Satellites, China Academy of Space Technology, Beijing 100091, China; Institute 706, Second Academy of China Aerospace Science and Industry Corporation, Beijing 100039, China; School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100854, China)
Source: 《计算机工程与设计》 (Computer Engineering and Design), PKU core journal, 2023, Issue 3, pp. 916-921 (6 pages)
Keywords: fine-grained visual classification; transformer; attention mechanism; center loss; metric learning; convolutional neural network; feature representation; feature space