Abstract
Detecting and classifying human actions in real-life video is difficult: backgrounds are often complex or blurred, and crowded scenes contain several actions at once, all of which bias the detection and classification of any single action. A 2D CNN extracts features from individual frames and therefore misses the motion information encoded across consecutive frames. To address this, a recognition method based on a three-dimensional convolutional neural network (3D CNN) is proposed, which aims to better capture the temporal and spatial information hidden in consecutive video frames. Experimental results show that the proposed method achieves a noticeably higher recognition rate than several existing classes of methods, which supports its effectiveness and robustness.
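The contrast between per-frame 2D features and 3D convolution can be illustrated with a small sketch. This is not the network described in the paper. It is a naive NumPy cross-correlation under assumed toy inputs, and the names `conv3d_valid`, `static`, `moving`, `temporal_kernel`, and `spatial_kernel` are hypothetical, chosen only for illustration.

```python
import numpy as np

def conv3d_valid(clip, kernel):
    """Naive 'valid' 3D cross-correlation of a (T, H, W) clip with a (kt, kh, kw) kernel."""
    T, H, W = clip.shape
    kt, kh, kw = kernel.shape
    out = np.zeros((T - kt + 1, H - kh + 1, W - kw + 1))
    for t in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                out[t, y, x] = np.sum(clip[t:t + kt, y:y + kh, x:x + kw] * kernel)
    return out

# Toy clips (assumed data): one bright pixel that either stays put or moves right.
static = np.zeros((4, 5, 5))
static[:, 2, 2] = 1.0
moving = np.zeros((4, 5, 5))
for t in range(4):
    moving[t, 2, t] = 1.0

# A kernel with temporal depth 1 sees each frame independently, like a 2D CNN.
spatial_kernel = np.ones((1, 3, 3))
static_2d = conv3d_valid(static, spatial_kernel).max(axis=(1, 2))
moving_2d = conv3d_valid(moving, spatial_kernel).max(axis=(1, 2))

# A kernel spanning two frames (frame t+1 minus frame t) responds to motion.
temporal_kernel = np.array([-1.0, 1.0]).reshape(2, 1, 1)
static_3d = conv3d_valid(static, temporal_kernel)
moving_3d = conv3d_valid(moving, temporal_kernel)

print("per-frame peak, static vs moving:", static_2d, moving_2d)
print("temporal energy, static vs moving:",
      np.abs(static_3d).sum(), np.abs(moving_3d).sum())
```

In this toy setting the per-frame peak responses are the same for both clips, while the kernel that spans two frames gives zero response for the static clip and a non-zero response for the moving one. A learned 3D CNN stacks many such spatio-temporal kernels; the sketch only shows why temporal extent in the kernel matters.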
Authors
ZHU Yunpeng (朱云鹏)
HUANG Xi (黄希)
HUANG Jiaxing (黄嘉兴)
School of Mechanical Engineering, Nantong University, Nantong 226019, China
Source
Modern Electronics Technique (《现代电子技术》)
Peking University Core Journal (北大核心)
2020, Issue 18, pp. 150-152, 156 (4 pages)
Funding
National Natural Science Foundation of China, Youth Fund project (51405246)
Nantong Science and Technology Bureau project (CP12014001, MS12017017-7)
Keywords
human action recognition
3D convolutional neural network
feature extraction
model training
deep learning
experimental comparison