Abstract
To address the difficulty of extracting inter-class differences in fine-grained image classification, a fine-grained classification method based on a Transformer bilinear network (BT-Net) is proposed. First, the input image is processed by different convolutions into two-dimensional vectors of different lengths. Then, encoders with different numbers of repetitions are constructed. Finally, the two network branches represent the image as a set of features from two Transformers, which yields richer complementary feature information and thereby improves fine-grained classification accuracy. Experimental results show that BT-Net achieves classification accuracies of 89.4%, 92.5%, and 94.8% on the CUB-200-2011, Cars196, and Stanford Dogs datasets, respectively, outperforming existing bilinear convolutional neural networks.
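The first and last steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives neither the convolution settings nor the fusion operator, so the 224x224 input, the 16 and 32 pixel patch sizes, the helper names `token_count` and `bilinear_pool`, and the use of outer-product pooling (the classic bilinear CNN formulation) are all assumptions made for illustration.

```python
import math


def token_count(image_size: int, kernel: int, stride: int) -> int:
    """Tokens produced by a square, unpadded convolutional patch embedding."""
    per_side = (image_size - kernel) // stride + 1
    return per_side * per_side


def bilinear_pool(a: list, b: list) -> list:
    """Flattened outer product of two feature vectors, followed by
    signed square root and L2 normalization (assumed fusion operator)."""
    outer = [x * y for x in a for y in b]
    signed = [math.copysign(math.sqrt(abs(v)), v) for v in outer]
    norm = math.sqrt(sum(v * v for v in signed)) or 1.0
    return [v / norm for v in signed]


# Hypothetical settings: two convolutions with different kernel sizes
# turn the same image into token sequences of different lengths.
fine_tokens = token_count(224, kernel=16, stride=16)    # 14 * 14 tokens
coarse_tokens = token_count(224, kernel=32, stride=32)  # 7 * 7 tokens

# Made-up stand-ins for the pooled outputs of the two encoder branches.
feat_a = [0.5, -1.0, 2.0]
feat_b = [1.0, 4.0]
fused = bilinear_pool(feat_a, feat_b)  # length len(feat_a) * len(feat_b)
```

The outer product pairs every feature of one branch with every feature of the other, which is one way the two Transformers could contribute complementary information; a real model would replace the stand-in vectors with encoder outputs.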
Authors
XIANG Xuyu (向旭宇)
LIU Yajie (刘亚捷)
ZENG Bin (曾彬)
TAN Yun (谭云)
Affiliations: College of Computer Science and Information Technology, Central South University of Forestry and Technology, Changsha 410000, China; College of Computer Science and Engineering, Changsha College, Changsha 410000, China
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》)
EI
CAS
CSCD
PKU Core (北大核心)
2024, No. 2, pp. 84-89 (6 pages)
Funding
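National Natural Science Foundation of China, Young Scientists Fund (62002392)
Natural Science Foundation of Hunan Province (2022JJ31019, 2020JJ4141)
Changsha Municipal Project for Building an R&D and Innovation Platform for IoT Security Situation Awareness and Risk Assessment Technology.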