Abstract
To address the insufficient use of shallow image features and the inadequate extraction of relationships between image objects in image caption generation, an image caption generation algorithm based on attention-based image feature extraction is proposed. Using context information from the language model, attention weights are adaptively assigned to image features of different depths, so that the attention-weighted image features guide caption generation and improve its quality. On the MSCOCO test set, the proposed algorithm achieves BLEU-1 and CIDEr scores of 0.752 and 0.934, respectively, which verifies its effectiveness.
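The adaptive weighting over feature depths described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring function, the assumption that each depth's features are already pooled and projected to a common dimension, and the names `softmax` and `attend_over_depths` are all hypothetical choices made here for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_depths(hidden, depth_features):
    """Weight one feature vector per network depth by its relevance to `hidden`.

    hidden: language-model context vector (e.g. an LSTM hidden state), length d.
    depth_features: K vectors of length d, one per depth (assumption: already
        pooled and projected to a shared dimension).
    Returns (weights, attended): K attention weights summing to 1, and the
    weighted sum of the depth features.
    """
    # Assumed scoring function: plain dot product between context and feature.
    scores = [sum(h * f for h, f in zip(hidden, feat)) for feat in depth_features]
    weights = softmax(scores)
    d = len(hidden)
    attended = [
        sum(w * feat[i] for w, feat in zip(weights, depth_features))
        for i in range(d)
    ]
    return weights, attended


# Toy example: three depths (shallow, middle, deep), feature dimension 2.
hidden = [1.0, 0.0]
depth_features = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
weights, attended = attend_over_depths(hidden, depth_features)
```

In this toy example the first depth aligns best with the context vector, so it receives the largest weight. In a captioning model, a vector like `attended` would be fed to the decoder at each time step alongside the previous word.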
Authors
李金轩
杜军平
周南
LI Jinxuan; DU Junping; ZHOU Nan (Beijing Key Laboratory of Intelligent Telecommunication Software and Multimedia, School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876)
Source
《南京信息工程大学学报(自然科学版)》
CAS
2019, No. 3, pp. 295-301 (7 pages)
Journal of Nanjing University of Information Science & Technology(Natural Science Edition)
Funding
National Natural Science Foundation of China (61772083, 61532006, 61877006, 61802028)
Guangxi Science and Technology Major Project (Guike AA18118054)
Keywords
attention mechanism
image caption
long short-term memory network
image feature extraction