
Video Summarization Based on Decoder Attention Mechanism (基于解码器注意力机制的视频摘要), cited by: 7
Abstract: Video summarization, a way to quickly browse and understand video content, has attracted wide attention. This paper treats video summarization as a sequence-to-sequence prediction problem, designs a novel decoder-based visual attention mechanism, and builds a supervised video summarization algorithm on it. The proposed method accounts for the intrinsic association between video frames: it uses a long short-term memory (LSTM) network to focus attention on the previously decoded sequence and fuses this historical decoding information to guide decoding, which improves prediction accuracy. Extensive experiments, conducted mainly on the TVSum and SumMe datasets, demonstrate the effectiveness and superiority of the proposed method.
Authors: Ji Zhong (冀中); Jiang Junjie (江俊杰) (School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China)
Source: Journal of Tianjin University: Science and Technology (《天津大学学报(自然科学与工程技术版)》), indexed in EI, CSCD, and the Peking University Core Journals list; 2018, No. 10, pp. 1023-1030 (8 pages)
Funding: National Natural Science Foundation of China (Nos. 61472273, 61771329)
Keywords: video summarization; visual attention model; encoder-decoder model; long short-term memory network
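
The abstract describes attention that looks back over the previously decoded sequence and fuses that history into the current decoding step. The record gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' model: plain dot-product attention stands in for the paper's attention function, toy vectors stand in for LSTM decoder states, and the sigmoid scoring rule and all function names (`attend_over_history`, `score_frames`) are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_history(current_state, history):
    """Attend from the current decoder state over earlier decoder states.

    Assumption: dot-product scoring. Returns (weights, context); with an
    empty history the context is a zero vector.
    """
    dim = len(current_state)
    if not history:
        return [], [0.0] * dim
    scores = [sum(c * h for c, h in zip(current_state, past)) for past in history]
    weights = softmax(scores)
    context = [
        sum(w * past[i] for w, past in zip(weights, history)) for i in range(dim)
    ]
    return weights, context


def score_frames(decoder_states):
    """Give each frame an importance score in (0, 1).

    Hypothetical rule: add the history context to the current state and pass
    the mean through a sigmoid. A trained model would use learned layers here.
    """
    history, scores = [], []
    for state in decoder_states:
        _, context = attend_over_history(state, history)
        fused = [s + c for s, c in zip(state, context)]
        scores.append(1.0 / (1.0 + math.exp(-sum(fused) / len(fused))))
        history.append(state)  # the current step becomes history for the next one
    return scores
```

Calling `score_frames` on a list of per-frame state vectors returns one score per frame; frames whose states align with earlier decoder states receive a larger contribution from the history context.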

