Enhanced Image Caption Based on Improved Transformer_decoder
Abstract: The Transformer decoder model (Transformer_decoder) has been widely used in image caption tasks, where the Self Attention mechanism captures fine-grained features to achieve deeper image understanding. This paper makes two improvements to Self Attention: Vision-Boosted Attention (VBA) and Relative-Position Attention (RPA). Vision-Boosted Attention adds a VBA layer to the Transformer_decoder, introducing visual features as auxiliary information into the attention model to guide the decoder toward descriptions that better match the image content. Relative-Position Attention extends Self Attention with trainable relative-position parameters, adding the relative positional relationships between words to the input sequence. Experiments on COCO2014 show that both VBA and RPA improve image caption performance to a certain extent, and that a decoder combining the two attention mechanisms produces better semantic descriptions.
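The paper's implementation is not reproduced here; the following is a minimal PyTorch sketch of how the two mechanisms described in the abstract could be combined in a single decoder layer. The clipped relative-distance bias, the causal mask, the pre-norm residual layout, and all names (RelativePositionAttention, VisionBoostedAttention, DecoderBlock, max_rel_dist) are illustrative assumptions, not the authors' code.

```python
# Sketch only (not the authors' implementation): self-attention with trainable
# relative-position biases (RPA) plus a cross-attention layer that injects
# region-level visual features as auxiliary guidance (VBA).
import torch
import torch.nn as nn
import torch.nn.functional as F


class RelativePositionAttention(nn.Module):
    """Masked self-attention over the word sequence with a learned bias per head
    for each clipped relative distance between query and key positions."""

    def __init__(self, d_model: int, n_heads: int, max_rel_dist: int = 16):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        self.max_rel_dist = max_rel_dist
        # One trainable scalar bias per head per clipped relative distance.
        self.rel_bias = nn.Parameter(torch.zeros(n_heads, 2 * max_rel_dist + 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / self.d_head ** 0.5   # (b, h, t, t)
        # Relative distance i - j, clipped to [-max_rel_dist, max_rel_dist].
        pos = torch.arange(t, device=x.device)
        rel = (pos[:, None] - pos[None, :]).clamp(-self.max_rel_dist, self.max_rel_dist)
        scores = scores + self.rel_bias[:, rel + self.max_rel_dist]  # broadcast over batch
        # Causal mask: each word attends only to itself and earlier words.
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool, device=x.device), diagonal=1)
        attn = F.softmax(scores.masked_fill(mask, float("-inf")), dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, t, -1)
        return self.out(out)


class VisionBoostedAttention(nn.Module):
    """Cross-attention from word representations to visual features, playing the
    role of the extra VBA layer inside the decoder block."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, words: torch.Tensor, vis_feats: torch.Tensor) -> torch.Tensor:
        out, _ = self.attn(query=words, key=vis_feats, value=vis_feats)
        return out


class DecoderBlock(nn.Module):
    """One decoder layer: RPA self-attention -> VBA visual guidance -> FFN."""

    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.rpa = RelativePositionAttention(d_model, n_heads)
        self.vba = VisionBoostedAttention(d_model, n_heads)
        self.ffn = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.ReLU(),
                                 nn.Linear(4 * d_model, d_model))
        self.ln1, self.ln2, self.ln3 = (nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, words: torch.Tensor, vis_feats: torch.Tensor) -> torch.Tensor:
        words = words + self.rpa(self.ln1(words))
        words = words + self.vba(self.ln2(words), vis_feats)
        return words + self.ffn(self.ln3(words))


if __name__ == "__main__":
    blk = DecoderBlock()
    words = torch.randn(2, 12, 512)   # (batch, caption length, d_model) word embeddings
    vis = torch.randn(2, 36, 512)     # e.g. 36 region features per image, projected to d_model
    print(blk(words, vis).shape)      # torch.Size([2, 12, 512])
```

The ordering of the RPA and VBA sublayers within the block is also an assumption; the paper only states that a VBA layer is added to the decoder and that relative-position parameters extend Self Attention.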
Authors: LIN Zhen-xian, QU Jia-xin, LUO Liang (School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China)
Source: Computer and Modernization (《计算机与现代化》), 2023, No. 1, pp. 7-12 (6 pages)
Funding: National Youth Science Foundation Project (12102341); Shaanxi Provincial Department of Education Project (21JK0904); Natural Science Basic Research Program of Shaanxi Province (2020JM-580).
Keywords: image caption; Transformer model; Self Attention mechanism; relative-position attention; vision-boosted attention