
Video summarization with a graph convolutional attention network (Cited by: 2)

Abstract: Video summarization has established itself as a fundamental technique for generating compact and concise summaries of video, easing the management and browsing of large-scale video data. Existing methods fail to fully consider the local and global relations among the frames of a video, leading to deteriorated summarization performance. To address this problem, we propose a graph convolutional attention network (GCAN) for video summarization. GCAN consists of two parts, embedding learning and context fusion, where embedding learning comprises a temporal branch and a graph branch. In particular, GCAN uses dilated temporal convolution to model local cues and temporal self-attention to exploit global cues among video frames. It learns graph embeddings via a multi-layer graph convolutional network to reveal the intrinsic structure of the frame samples. The context fusion part combines the output streams of the temporal and graph branches to create a context-aware representation of the frames, on which importance scores are evaluated for selecting representative frames to generate the video summary. Experiments on two benchmark datasets, SumMe and TVSum, show that the proposed GCAN approach outperforms several state-of-the-art alternatives in three evaluation settings.
Source: Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2021, Issue 6, pp. 902-913 (12 pages).
Funding: Project supported by the National Natural Science Foundation of China (Nos. 61872122 and 61502131), the Zhejiang Provincial Natural Science Foundation of China (No. LY18F020015), the Open Project Program of the State Key Lab of CAD&CG, China (No. 1802), and the Zhejiang Provincial Key Research and Development Program, China (No. 2020C01067).
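The abstract describes a two-branch pipeline: a temporal branch (dilated convolution for local cues plus self-attention for global cues), a graph branch (multi-layer graph convolution over a frame-similarity graph), and a fusion step that scores frame importance. The following is a minimal NumPy sketch of those operations only; all shapes, weight initializations, the similarity graph, and the top-k frame selection are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def dilated_temporal_conv(X, W, dilation=2):
    """Local cues: 1D convolution over frames with dilated taps.
    X: (T, d) frame features; W: (k, d, d) kernel weights."""
    T, d = X.shape
    k = W.shape[0]
    pad = dilation * (k - 1) // 2
    Xp = np.pad(X, ((pad, pad), (0, 0)))
    out = np.zeros((T, d))
    for t in range(T):
        for i in range(k):
            out[t] += Xp[t + i * dilation] @ W[i]
    return np.maximum(out, 0)  # ReLU

def self_attention(X):
    """Global cues: scaled dot-product attention across all frames."""
    scores = X @ X.T / np.sqrt(X.shape[1])
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)  # row-wise softmax
    return A @ X

def gcn_layer(X, A, W):
    """One graph-convolution layer with symmetric normalization:
    D^{-1/2} (A + I) D^{-1/2} X W, followed by ReLU."""
    A_hat = A + np.eye(A.shape[0])
    d_inv = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv[:, None] * d_inv[None, :]
    return np.maximum(A_norm @ X @ W, 0)

rng = np.random.default_rng(0)
T, d = 8, 16                                   # assumed: 8 frames, 16-dim features
X = rng.standard_normal((T, d))
A = (X @ X.T > 0).astype(float)                # assumed frame-similarity graph

W_t = rng.standard_normal((3, d, d)) * 0.1     # temporal-branch weights
W_g = rng.standard_normal((d, d)) * 0.1        # graph-branch weights

temporal = self_attention(dilated_temporal_conv(X, W_t))
graph = gcn_layer(X, A, W_g)
context = np.concatenate([temporal, graph], axis=1)   # context fusion

w_score = rng.standard_normal(context.shape[1]) * 0.1
scores = 1 / (1 + np.exp(-(context @ w_score)))       # per-frame importance
summary = np.argsort(scores)[::-1][:3]                # keep top-3 frames
```

The sketch keeps the structure of the abstract: two embedding streams concatenated into a context-aware representation, then a scoring head that ranks frames for the summary.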