An overview of video captioning methods based on deep learning

Cited by: 8
Abstract: With breakthroughs in deep learning for computer vision and natural language processing, cross-modal research on image captioning and video captioning has grown rapidly. Because video carries temporal structure and its content is diverse and complex, video captioning is more challenging than image captioning. Video captioning methods fall into two categories: template-based methods and encoder-decoder methods. This survey focuses on encoder-decoder methods built on deep learning. It first analyzes and compares the development of model architectures, then summarizes existing methods. It next introduces several influential datasets and evaluation metrics, and finally summarizes the key unsolved problems and research difficulties.
Authors: CHANG Zhi (常志); ZHAO De-xin (赵德新) (School of Computer Science and Engineering, Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin 300384, China)
Institution: Tianjin University of Technology
Source: Journal of Tianjin University of Technology (《天津理工大学学报》), 2020, No. 6, pp. 17-23 (7 pages)
Funding: National Natural Science Foundation of China (61202169).
Keywords: deep learning; video captioning; encoder-decoder