Abstract
Automatic video description is a hot topic in computer vision. Generating a description for a video requires natural language processing techniques and must allow both the input (a sequence of video frames) and the output (a sequence of description words) to vary in length. To this end, this paper draws on recent advances in machine translation and designs a two-layer LSTM (Long Short-Term Memory) model based on the encoder-decoder architecture. Following the idea of representation learning, which is central to building deep learning frameworks, we extract feature vectors from the video frames with a convolutional neural network (CNN) and use them as the input sequence of the sequence-to-sequence model, and we compare how different feature extraction methods affect the two-layer LSTM video description model. The experimental results show that the model is able to learn sequence knowledge and transform it into textual descriptions.
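To make the architecture described above concrete, the following is a minimal sketch of a CNN-plus-two-layer-LSTM encoder-decoder pipeline. It assumes a PyTorch implementation; the class name VideoCaptioner, the use of VGG-16 fc7 features, and all dimensions and vocabulary sizes are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

class VideoCaptioner(nn.Module):
    """Two-layer LSTM encoder-decoder for video description (sketch)."""

    def __init__(self, feat_dim=4096, hidden_dim=512, embed_dim=512, vocab_size=10000):
        super().__init__()
        # Encoder: a two-layer LSTM reads the sequence of per-frame CNN features.
        self.encoder = nn.LSTM(feat_dim, hidden_dim, num_layers=2, batch_first=True)
        # Decoder: a two-layer LSTM emits the description word by word,
        # initialized with the encoder's final hidden state.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        self.project = nn.Linear(hidden_dim, vocab_size)

    def forward(self, frame_feats, captions):
        # frame_feats: (batch, n_frames, feat_dim) CNN features; n_frames may vary
        # captions:    (batch, n_words) token ids, teacher-forced during training
        _, state = self.encoder(frame_feats)   # compress the whole video into `state`
        emb = self.embed(captions)             # (batch, n_words, embed_dim)
        out, _ = self.decoder(emb, state)      # decode conditioned on the video
        return self.project(out)               # (batch, n_words, vocab_size) logits


# Frame features can come from a pretrained CNN; VGG-16 fc7 (4096-d) is used
# here purely as an example of one candidate feature extractor.
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
vgg.classifier = nn.Sequential(*list(vgg.classifier.children())[:-1])  # keep fc7
vgg.eval()

with torch.no_grad():
    frames = torch.randn(30, 3, 224, 224)      # 30 frames of one video
    feats = vgg(frames).unsqueeze(0)           # (1, 30, 4096)

model = VideoCaptioner()
logits = model(feats, torch.randint(0, 10000, (1, 12)))  # (1, 12, 10000)
```

The decoder above is teacher-forced for training; at inference time one would instead feed back the highest-scoring (or beam-searched) token at each step until an end-of-sentence token is produced.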
Source
Journal of Nanjing University of Information Science & Technology (Natural Science Edition)
CAS
2017, No. 6, pp. 642-649 (8 pages)
Funding
National Natural Science Foundation of China (61572503, 61432019)
Beijing Natural Science Foundation (4152053)
Keywords
video description
LSTM model
representation learning
feature embedding