Abstract
Video summarization, a way to quickly browse and understand video content, has attracted wide attention. This paper treats video summarization as a sequence-to-sequence prediction problem, designs a novel decoder-based visual attention mechanism, and builds a supervised video summarization algorithm on it. The proposed method takes the intrinsic association between video frames into account: it uses a long short-term memory network to focus attention on the previously decoded sequence and fuses that historical decoding information to guide decoding, which improves prediction accuracy. Extensive experiments, mainly on the TVSum and SumMe datasets, demonstrate the effectiveness and superiority of the proposed method.
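The idea of attending over the previously decoded sequence can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no equations, so the dot-product attention, the plain tanh recurrent cell standing in for the LSTM decoder, the sigmoid scoring layer, and every name below (`score_frames`, `W_in`, `W_h`, `W_ctx`, `w_out`) are assumptions made purely for illustration.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]


def score_frames(frames, W_in, W_h, W_ctx, w_out):
    """Assign each frame an importance score in (0, 1).

    Hypothetical sketch: a tanh recurrent cell replaces the LSTM decoder,
    and attention is a dot product between the current state and every
    earlier decoder state.
    """
    hidden_dim = len(W_h)
    h = [0.0] * hidden_dim
    history = []  # decoder states produced so far
    scores = []
    for x in frames:
        if history:
            # Attention weights over the previously decoded states.
            weights = softmax([dot(past, h) for past in history])
            context = [
                sum(w * past[i] for w, past in zip(weights, history))
                for i in range(hidden_dim)
            ]
        else:
            context = [0.0] * hidden_dim  # nothing decoded yet
        # Fuse the frame feature, previous state, and history context.
        pre = [
            a + b + c
            for a, b, c in zip(matvec(W_in, x), matvec(W_h, h), matvec(W_ctx, context))
        ]
        h = [math.tanh(p) for p in pre]
        history.append(h)
        scores.append(1.0 / (1.0 + math.exp(-dot(w_out, h))))
    return scores


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Toy usage with random, untrained weights and made-up dimensions.
rng = random.Random(0)
feat_dim, hidden_dim, n_frames = 6, 4, 5
frames = random_matrix(n_frames, feat_dim, rng)
scores = score_frames(
    frames,
    random_matrix(hidden_dim, feat_dim, rng),
    random_matrix(hidden_dim, hidden_dim, rng),
    random_matrix(hidden_dim, hidden_dim, rng),
    [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)],
)
print(scores)
```

In a supervised setting such scores would be trained against annotated frame importance, and the highest-scoring frames or shots would form the summary. The weights here are random, so the printed values carry no meaning beyond showing the data flow.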
Authors
冀中 (Ji Zhong)
江俊杰 (Jiang Junjie)
School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
Source
《天津大学学报(自然科学与工程技术版)》 (Journal of Tianjin University: Science and Technology)
EI
CSCD
Peking University Core Journals (北大核心)
2018, No. 10, pp. 1023-1030 (8 pages)
Funding
National Natural Science Foundation of China (Grant Nos. 61472273, 61771329)
Keywords
video summarization
visual attention model
encoder-decoder model
long short-term memory network