Spatio-Temporal Action Localization Algorithm Based on 3D-SVD

Abstract: With cameras now widespread, AI-based action analysis plays an increasingly important role in intelligent video applications. Most existing algorithms rely on an optical flow network or a 3D network to capture the temporal information of actions. However, optical flow networks and general 3D networks are computationally expensive, and they are inefficient when classification and localization are performed at the same time. To address this problem, this paper builds a two-stream framework capable of spatial localization and classification, decomposes the 3D convolution kernels in the 3D network branch following the idea of SVD to reduce the number of 3D network parameters, and uses a dynamic programming algorithm to search efficiently for the optimal action tubes. During training, the mixup algorithm is used to augment the data set and improve training. Experiments were carried out on UCF101-24 and J-HMDB-21, two widely used action localization data sets. Compared with the baseline algorithm, Frame-mAP on the two data sets improved by 7.1% and 4.8%, respectively, and Video-mAP on J-HMDB-21 under different IoU thresholds improved by 5.2% and 4.8%. The results show that the proposed algorithm effectively improves action localization and compares favorably with other algorithms.
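The abstract does not spell out how the 3D convolution kernels are decomposed, so the following is only a minimal sketch of one common SVD-style factorization, not the authors' implementation. It assumes a kernel of shape (C_out, C_in, T, H, W), flattens it to a matrix, and keeps the top `rank` singular components. The two factors can be read as a spatial 1×H×W convolution into `rank` channels followed by a temporal T×1×1 convolution. The function names, the reshaping scheme, and the `rank` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def factorize_kernel(kernel, rank):
    """Truncated-SVD factorization of a 3D conv kernel (illustrative assumption).

    kernel: array of shape (c_out, c_in, t, h, w).
    Returns (left, right) with shapes (c_out * t, rank) and (rank, c_in * h * w).
    """
    c_out, c_in, t, h, w = kernel.shape
    # Rows index (c_out, t); columns index (c_in, h, w).
    matrix = kernel.transpose(0, 2, 1, 3, 4).reshape(c_out * t, c_in * h * w)
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    left = u[:, :rank] * s[:rank]   # temporal factor, scaled by singular values
    right = vt[:rank, :]            # spatial factor
    return left, right


def reconstruct_kernel(left, right, shape):
    """Rebuild the (approximate) kernel of the given original shape."""
    c_out, c_in, t, h, w = shape
    matrix = left @ right
    return matrix.reshape(c_out, t, c_in, h, w).transpose(0, 2, 1, 3, 4)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    kernel = rng.standard_normal((8, 4, 3, 3, 3))  # toy sizes, not from the paper
    left, right = factorize_kernel(kernel, rank=6)
    approx = reconstruct_kernel(left, right, kernel.shape)
    print("original parameters:", kernel.size)
    print("factored parameters:", left.size + right.size)
    print("relative error:", np.linalg.norm(approx - kernel) / np.linalg.norm(kernel))
```

With these toy sizes the factored form stores rank × (C_out·T + C_in·H·W) values instead of C_out·C_in·T·H·W, so a smaller `rank` trades approximation accuracy for fewer parameters. A random kernel has no low-rank structure, so the error here is large; trained kernels are typically closer to low rank.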

Authors: WANG Zi-Yan; ZHANG Li-Hua; ZHAI Peng; DU Yang-Tao (Institute of AI and Robotics, Fudan University, Shanghai 200433, China; Ji Hua Laboratory, Foshan 528200, China; Engineering Research Center of AI and Robotics, Ministry of Education, Shanghai 200433, China; Engineering Research Center of AI and Unmanned Vehicle Systems of Jilin Province, Changchun 130012, China; Shanghai Engineering Research Center of AI and Robotics, Shanghai 200433, China)

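Institutions: Institute of AI and Robotics, Fudan University; Ji Hua Laboratory; Engineering Research Center of AI and Robotics, Ministry of Education; Engineering Research Center of AI and Unmanned Vehicle Systems of Jilin Province; Shanghai Engineering Research Center of AI and Robotics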

Source: Computer Systems & Applications, 2021, No. 10, pp. 109-117 (9 pages)

Funding: Shanghai Science and Technology Commission project (19511132000).

Keywords: action localization; SVD; data augmentation; action tubes

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

