Abstract
Temporal action detection takes a long, untrimmed video containing multiple action instances and complex background content, and determines the start time, end time, and category of each action instance. For this task, a detection model based on two-stream convolutional neural networks is proposed. A two-stream convolutional network first extracts a feature sequence from the video, and TAG (Temporal Actionness Grouping) then generates action proposals. To obtain high-quality proposals, each proposal is fed to a boundary regression network that corrects its boundaries and brings them closer to the ground truth. Each proposal is then extended into a three-segment feature design that carries context information, and a multi-layer perceptron finally classifies the action. Experimental results show that the proposed algorithm achieves good mAP on the THUMOS 2014 and ActivityNet v1.3 datasets.
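The three-segment design with context information can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how far a proposal is extended or how features are pooled, so the `context_ratio` of 0.5, the mean pooling, and all function names below are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Segment = Tuple[int, int]  # half-open snippet index range [start, end)


def three_segments(start: int, end: int, num_snippets: int,
                   context_ratio: float = 0.5) -> Tuple[Segment, Segment, Segment]:
    """Split an extended proposal into (before, course, after) index ranges.

    context_ratio is a hypothetical setting: each context segment spans that
    fraction of the proposal length and is clipped to the video bounds.
    """
    if not 0 <= start < end <= num_snippets:
        raise ValueError("proposal must satisfy 0 <= start < end <= num_snippets")
    pad = max(1, round((end - start) * context_ratio))
    before = (max(0, start - pad), start)
    after = (end, min(num_snippets, end + pad))
    return before, (start, end), after


def mean_pool(features: Sequence[Sequence[float]], seg: Segment) -> List[float]:
    """Average the snippet features inside seg; zeros if the segment is empty."""
    dim = len(features[0])
    lo, hi = seg
    if hi <= lo:
        return [0.0] * dim
    return [sum(features[t][d] for t in range(lo, hi)) / (hi - lo)
            for d in range(dim)]


def proposal_descriptor(features: Sequence[Sequence[float]], start: int, end: int,
                        context_ratio: float = 0.5) -> List[float]:
    """Concatenate the pooled features of the three segments (assumed design)."""
    descriptor: List[float] = []
    for seg in three_segments(start, end, len(features), context_ratio):
        descriptor.extend(mean_pool(features, seg))
    return descriptor
```

Under these assumptions, a proposal covering snippets 4 to 8 of a 10-snippet video is described by the pooled features of snippets 2-4, 4-8, and 8-10, so a downstream classifier such as a multi-layer perceptron sees what happens just before and just after the candidate action as well as during it.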
Authors
LIU Yun (刘云)
ZHANG Kun (张堃)
WANG Chuan-Xu (王传旭)
Information Science and Technology Academy, Qingdao University of Science and Technology, Qingdao 266061, China
Source
Computer Systems & Applications (《计算机系统应用》)
2019, No. 7, pp. 234-239 (6 pages)
Funding
National Natural Science Foundation of China (61472196, 61672305)
Keywords
human action recognition
two-stream convolutional networks
deep learning
temporal action localization