A Three-Stream Network Method for Human Action Recognition Based on Frame Sequence Features

A frame sequence feature based method on human action recognition
Abstract: With advances in computer science and deep learning, human action recognition has become an important topic in computer vision. The prevailing two-stream network model cannot extract a video's inter-frame sequence features at the same time as its image and motion features, and its robustness drops sharply when local sequence features interact spatio-temporally with long- and short-term motion features. To address this, this paper proposes a three-stream network method for human action recognition based on video frame sequence features. After preprocessing, the dense optical flow frames of a video are fed into a temporal network, while the RGB frames are fed into both a spatial network and a frame sequence feature extraction network; the three networks are pretrained concurrently. The features output by the three networks are fused by weighted addition, and a multi-layer perceptron then produces the action classification. In experiments on the UCF11, UCF50, and HMDB51 datasets, the method achieves classification accuracies of 99.17%, 97.40%, and 96.88%, respectively. Compared with conventional two-stream methods, it effectively combines the spatial, temporal, and frame-sequence information of actions, substantially improves recognition accuracy, and generalizes better.
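The fusion and classification step described in the abstract can be illustrated with a minimal sketch. It assumes each of the three streams has already produced a feature vector of the same length; the fusion weights, the feature dimension, the hidden-layer size, and all function names here are hypothetical placeholders, since the abstract does not state the values or architecture the authors used. The perceptron is randomly initialized, so it only demonstrates the data flow, not trained behavior.

```python
import math
import random


def weighted_fusion(features, weights):
    """Fuse equal-length feature vectors by an element-wise weighted sum."""
    if len(features) != len(weights):
        raise ValueError("one weight is required per feature vector")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all feature vectors must have the same length")
    return [sum(w * f[i] for f, w in zip(features, weights)) for i in range(dim)]


def dense(x, W, b):
    """Fully connected layer: one output per row of W."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(W, b)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def mlp_classify(x, params):
    """One-hidden-layer perceptron returning class probabilities."""
    h = relu(dense(x, params["W1"], params["b1"]))
    return softmax(dense(h, params["W2"], params["b2"]))


def random_params(in_dim, hidden_dim, num_classes, seed=0):
    """Untrained, randomly initialized parameters (illustration only)."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "W1": matrix(hidden_dim, in_dim), "b1": [0.0] * hidden_dim,
        "W2": matrix(num_classes, hidden_dim), "b2": [0.0] * num_classes,
    }


if __name__ == "__main__":
    rng = random.Random(1)
    dim = 8  # assumed feature length shared by all three streams
    spatial = [rng.random() for _ in range(dim)]   # stand-in for spatial-network output
    temporal = [rng.random() for _ in range(dim)]  # stand-in for temporal-network output
    sequence = [rng.random() for _ in range(dim)]  # stand-in for frame-sequence-network output

    # Assumed weights; the abstract does not give the actual values.
    fused = weighted_fusion([spatial, temporal, sequence], [0.4, 0.3, 0.3])
    probs = mlp_classify(fused, random_params(dim, 16, 11))  # 11 classes, as in UCF11
    print("predicted class index:", probs.index(max(probs)))
```

In practice the weights would be tuned or learned and the perceptron trained on the fused features; the sketch fixes them only to keep the example self-contained.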
Authors: HUANG Ruifeng (黄瑞丰); CHEN Chong (陈冲); CHENG Rui (程睿); WANG Xu (王旭); ZHANG Longfeng (张龙凤) (School of Electronic and Information Engineering, Anhui Jianzhu University, Hefei 231600, China; Deep Learning and Computer Vision Lab of Anhui Jianzhu University, Hefei 231600, China; Anhui International Joint Research Center for Ancient Architecture Intellisencing and Multi-Dimensional Modeling, Hefei 231600, China)
Source: Journal of Chizhou University (《池州学院学报》), 2024, No. 3, pp. 21-27 (7 pages)
Funding: National Natural Science Foundation of China (62001004); Anhui Provincial University Natural Science Research Project (KJ2019A0768); Anhui Jianzhu University Research Start-up Project for Introduced Talent (2020QDZ24).
Keywords: Human action recognition; Three-stream network; Frame sequence feature; UCF11; UCF50; HMDB51