Abstract
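In computer vision, most video representation methods are supervised and require large sets of labeled training videos, but annotating large amounts of video data costs considerable manpower and resources. To address this problem, an unsupervised video representation method based on deep neural networks is proposed. The method uses video blocks extracted by the improved dense trajectory (iDT) algorithm to alternately train a deep convolutional neural network and cluster features, yielding a deep convolutional neural network model that can extract video features; unsupervised video representation is achieved through the mid-level semantic features of the videos. The model was evaluated on the HMDB 51 action recognition dataset and the CCV event detection dataset for action recognition and event detection, respectively, achieving a recognition rate of 62.6% and a detection rate of 43.6%, which demonstrates the effectiveness of the proposed method.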
Most video representation methods in computer vision are supervised and require large sets of labeled training videos, which are expensive to scale up to rapidly growing data. To solve this problem, this paper proposes an unsupervised video representation method using a deep convolutional neural network. The improved dense trajectory (iDT) algorithm is used to extract video blocks, which are then used to alternately train the convolutional neural network and cluster the features. The deep convolutional neural network model is trained with this iterative algorithm to obtain unsupervised video representations. The proposed model is applied to extract features on the HMDB 51 and CCV datasets for the tasks of action recognition and event detection, respectively. In the experiments, a mean accuracy of 62.6% and a mean average precision (mAP) of 43.6% are obtained, respectively, which demonstrates the effectiveness of the proposed method.
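The alternating scheme described in the abstract (cluster the current features, then train the network on the resulting cluster assignments, and repeat) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: a one-layer tanh mapping stands in for the deep convolutional neural network, random vectors stand in for iDT-extracted video blocks, plain k-means is assumed as the clustering step, and the names `kmeans`, `softmax`, and `alternate_training` are hypothetical.

```python
import numpy as np


def kmeans(feats, k, rng, iters=10):
    """Plain k-means; returns one cluster index per row of feats."""
    centers = feats[rng.choice(len(feats), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = ((feats[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = feats[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return labels


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def alternate_training(blocks, k=3, feat_dim=8, rounds=5, steps=50, lr=0.1, seed=0):
    """Alternate between clustering features and training on the cluster labels.

    blocks: (n, d) array; a hypothetical stand-in for flattened video blocks.
    Returns the learned feature weights and the last round's pseudo-labels.
    """
    rng = np.random.default_rng(seed)
    n, d = blocks.shape
    W = rng.normal(scale=0.1, size=(d, feat_dim))  # stand-in for the CNN
    for _ in range(rounds):
        # Step 1: cluster the current features to obtain pseudo-labels.
        feats = np.tanh(blocks @ W)
        labels = kmeans(feats, k, rng)
        onehot = np.eye(k)[labels]
        # Step 2: train the feature map plus a fresh classifier head on them.
        V = rng.normal(scale=0.1, size=(feat_dim, k))
        for _ in range(steps):
            h = np.tanh(blocks @ W)
            p = softmax(h @ V)
            g_logits = (p - onehot) / n  # gradient of the mean cross-entropy
            g_V = h.T @ g_logits
            g_h = g_logits @ V.T
            g_W = blocks.T @ (g_h * (1.0 - h ** 2))  # backprop through tanh
            V -= lr * g_V
            W -= lr * g_W
    return W, labels


if __name__ == "__main__":
    demo_blocks = np.random.default_rng(1).normal(size=(30, 5))
    W, labels = alternate_training(demo_blocks)
    print(W.shape, labels.shape)
```

In this toy version the classifier head is re-initialized each round because cluster indices are not guaranteed to be consistent from one clustering pass to the next; whether the paper handles this the same way is not stated in the abstract.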
Authors
吴心筱
伍堃
WU Xinxiao; WU Kun (Beijing Laboratory of Intelligent Information Technology, Beijing Institute of Technology, Beijing 100081, China)
Source
《北京交通大学学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2017, No. 6, pp. 8-12 (5 pages)
JOURNAL OF BEIJING JIAOTONG UNIVERSITY
Funding
National Natural Science Foundation of China (61673062, 61472038)
Keywords
unsupervised learning
convolutional neural networks
video representation