
Image captioning algorithm based on global-local features and an adaptive attention mechanism (cited by: 6)

Image captioning based on global-local feature and adaptive-attention
Abstract: To address the gap between the low-level visual features of an image and high-level semantic concepts, an image captioning algorithm is proposed that determines where the image should be attended to, mines higher-level semantic information, and improves the detail of the generated sentences. During visual feature extraction, global-local features of the input image are extracted and used as the visual input, so that the focus on the image at each time step can be determined and image details are described more completely. During decoding, an attention mechanism weights the image features, so that the dependence of the word output at the current time step on visual information versus semantic information is selected adaptively, which improves captioning performance. Experimental results show that the method is competitive with other image captioning algorithms: it identifies objects in the image more accurately and in finer detail, describes the input image more comprehensively, and achieves higher recognition accuracy on tiny objects.
Authors: ZHAO Xiao-hu (赵小虎); YIN Liang-fei (尹良飞); ZHAO Cheng-long (赵成龙) (National and Local Joint Engineering Laboratory of Internet Application Technology on Mine, China University of Mining and Technology, Xuzhou 221008, China; School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China)
Source: Journal of Zhejiang University: Engineering Science (《浙江大学学报(工学版)》); indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2020, No. 1, pp. 126-134 (9 pages)
Funding: National Key Research and Development Program of China (2017YFC0804400)
Keywords: image captioning; image focus; higher-level semantic information; description detail; global-local feature extraction; adaptive-attention mechanism
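The decoding step described in the abstract, in which attention weights the image features and the model adaptively balances visual against semantic information, can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the `sentinel` vector standing in for semantic (language) information, and every function and variable name below are assumptions chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def adaptive_attention(global_feat, local_feats, sentinel, hidden):
    """Hypothetical adaptive attention step.

    Scores the global feature, each local feature, and a semantic
    sentinel against the decoder hidden state, then mixes them.
    Returns (context, visual_weights, beta), where beta is the weight
    placed on semantic information rather than on the image.
    """
    visual = [global_feat] + list(local_feats)
    # Assumption: plain dot-product scoring instead of a learned scorer.
    scores = [dot(v, hidden) for v in visual] + [dot(sentinel, hidden)]
    weights = softmax(scores)
    visual_weights, beta = weights[:-1], weights[-1]
    dim = len(hidden)
    context = [
        sum(w * v[j] for w, v in zip(visual_weights, visual)) + beta * sentinel[j]
        for j in range(dim)
    ]
    return context, visual_weights, beta


# Toy inputs (made up for illustration, not data from the paper).
global_feat = [0.5, 0.5]
local_feats = [[1.0, 0.0], [0.0, 1.0]]
sentinel = [0.2, 0.2]
hidden = [2.0, 0.0]

context, visual_weights, beta = adaptive_attention(
    global_feat, local_feats, sentinel, hidden
)
```

In this toy setup the visual weights and `beta` sum to 1, so a larger `beta` means the current word leans more on semantic context and less on the image. A trained model would learn the scoring function and the sentinel rather than use fixed vectors.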