

Hourglass attention and progressive hybrid Transformer for image classification
Abstract  Transformers are widely used in image classification, but on small datasets they suffer from limited training data and excessive parameter counts, leading to low classification accuracy and slow convergence. This paper proposes a progressive hybrid Transformer model with hourglass attention. First, global feature relationships are modeled by an hourglass self-attention with down-up sampling, where up-sampling restores the information lost in the down-sampling operation; a learnable temperature parameter and a negative diagonal mask sharpen the attention score distribution, avoiding the over-smoothing caused by stacking too many layers. Second, a progressive down-sampling module is designed to obtain fine-grained multi-scale feature maps, effectively capturing low-dimensional feature information. Finally, a hybrid architecture is adopted: the designed hourglass attention is used in the top stages, pooling layers replace the attention modules in the bottom stages, and layer normalization with depthwise convolution is introduced to increase the network's locality. Experiments on the T-ImageNet, CIFAR10, CIFAR100, and SVHN datasets show a classification accuracy of up to 97.42%, with a computational cost of 3.41G FLOPs and 25M parameters. Compared with the baseline algorithms, the proposed method significantly improves classification accuracy while markedly reducing computation and parameter count, improving the performance of Transformer models on small datasets.
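The attention-sharpening step described in the abstract (a learnable temperature and a negative diagonal mask applied to the scores before the softmax) can be sketched as follows. This is an illustrative reconstruction based only on the abstract, not the authors' released code; the function name, the fixed temperature value, and the mask constant are all assumptions.

```python
import math

def sharpened_attention_scores(scores, temperature=0.5, mask_value=-1e9):
    """Sharpen an attention score matrix (one row per query token).

    Dividing the logits by a temperature < 1 (learnable in the paper,
    fixed here for illustration) makes the softmax distribution more
    peaked, and setting each diagonal entry to a large negative value
    masks a token's attention to itself, as the abstract's "negative
    diagonal mask" suggests.
    """
    n = len(scores)
    out = []
    for i in range(n):
        row = [s / temperature for s in scores[i]]
        row[i] = mask_value  # negative diagonal mask
        # Numerically stable softmax over the sharpened, masked row.
        m = max(row)
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        out.append([e / z for e in exps])
    return out

# Toy 3x3 logits: each token initially attends mostly to itself.
probs = sharpened_attention_scores([[2.0, 1.0, 0.5],
                                    [1.0, 2.0, 0.5],
                                    [0.5, 1.0, 2.0]])
```

After masking, each diagonal probability is driven to (near) zero while every row still sums to 1, so attention mass is redistributed to the other tokens with a sharper profile than a plain softmax would give.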
Authors  PENG Yanfei; CUI Yun; CHEN Kun; LI Yongxin (School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China)
Source  Chinese Journal of Liquid Crystals and Displays (CAS; CSCD; Peking University Core), 2024, No. 9, pp. 1223-1232 (10 pages)
Funding  National Natural Science Foundation of China (No. 61772249); Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (No. LJKZ0358)
Keywords  image classification for small datasets; Transformer; hourglass attention; multi-scale features; hybrid architecture
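The down-up sampling at the core of the hourglass attention described above can also be sketched in one dimension. This is a minimal illustration under stated assumptions: the abstract does not specify the sampling operators, so average pooling is assumed for down-sampling, nearest-neighbour repetition for up-sampling, and a residual addition for restoring the lost detail; the function name is hypothetical.

```python
def hourglass_down_up(features, factor=2):
    """Down-sample a 1-D token sequence by average pooling, then
    up-sample back to the original length and add the original
    features as a residual, so information lost in pooling is
    partially restored (the abstract's down-up sampling idea).
    Assumes len(features) is a multiple of `factor`.
    """
    # Average-pool non-overlapping groups of `factor` tokens.
    pooled = [sum(features[i:i + factor]) / factor
              for i in range(0, len(features), factor)]
    # Nearest-neighbour up-sample back to the original length.
    upsampled = [pooled[i // factor] for i in range(len(features))]
    # Residual connection restores fine-grained detail.
    return [f + u for f, u in zip(features, upsampled)]

# Example: [1, 3, 2, 4] pools to [2, 3], up-samples to [2, 2, 3, 3],
# and the residual sum gives [3, 5, 5, 7].
result = hourglass_down_up([1.0, 3.0, 2.0, 4.0])
```

In the actual model this would operate on key/value token maps inside the attention block; the sketch only shows the shape-reducing and shape-restoring data flow.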
