Abstract
Attention modules embedded in deep learning networks have been shown to extract feature information more fully. Existing mechanisms, however, still leave room for improvement in channel feature extraction, spatial feature extraction, and the interaction between the two. This paper proposes a compound three-branch attention mechanism, Triplet Coordinate Attention (TCA), which builds on Triplet Attention (TA) and introduces Coordinate Attention (CA) to better extract spatial interactive attention and channel-spatial interactive attention. TCA adds few parameters and little computation, so it can be embedded in a variety of backbone networks. Extensive comparative experiments on the image classification dataset miniImageNet and the object detection datasets VOC2007 and VOC2012 show that embedding TCA in a network model further improves accuracy. In particular, compared with CA and TA, TCA improves Top-1 accuracy by 1.01% and 1.62%, respectively, for image classification with MobileNetV2, and improves AP50 by 0.5% and 2.0%, respectively, for object detection with MobileNetV3+SSDLite.
Authors
何强
杨云飞
冯松
HE Qiang; YANG Yunfei; FENG Song (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; Yunnan Key Laboratory of Computer Technology Application, Kunming 650500, China)
Source
《小型微型计算机系统》
CSCD
Peking University Core Journals
2024, No. 10, pp. 2485-2491 (7 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (Grant Nos. 11763004, 11803085, U1931107).
Keywords
attention mechanism
interactive attention
image classification
object detection