A Three-Stream Network Method for Human Action Recognition Based on Frame Sequence Features

A frame sequence feature based method on human action recognition

Abstract: With advances in computer science and deep learning, human action recognition has become an important topic in computer vision. Mainstream two-stream network models cannot extract the inter-frame sequence features of a video while also extracting image and motion features, and their robustness drops sharply when local sequence features interact spatiotemporally with long- and short-term motion features. To address this, a three-stream network method for human action recognition based on video sequence features is proposed. After preprocessing, the dense optical flow frames of a video are fed into a temporal network, while the RGB frames are fed into both a spatial network and a frame sequence feature extraction network; the three networks are pretrained at the same time. The features output by the three networks are fused by weighted summation, and a multi-layer perceptron then produces the action classification. Experiments on the UCF11, UCF50, and HMDB51 datasets yield classification accuracies of 99.17%, 97.40%, and 96.88%, respectively. Compared with traditional two-stream network methods, the proposed method effectively combines the spatial, temporal, and frame sequence information of actions, substantially improves recognition accuracy, and generalizes better.
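The fusion and classification steps named in the abstract (weighted summation of the three stream features, followed by a multi-layer perceptron) can be illustrated with a minimal sketch. This is not the authors' implementation. The feature dimension, hidden size, stream weights, random parameters, and all function names below are assumptions chosen for illustration, since the abstract does not specify them. The three feature vectors are random stand-ins for the outputs of the spatial, temporal, and frame sequence networks.

```python
import math
import random


def weighted_sum_fusion(features, weights):
    """Fuse equal-length feature vectors as sum_k w_k * f_k."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per stream")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("all stream features must share one dimension")
    return [sum(w * f[i] for w, f in zip(weights, features)) for i in range(dim)]


def dense(x, W, b):
    """Fully connected layer: y_j = sum_i W[j][i] * x[i] + b[j]."""
    return [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_classify(x, params):
    """One-hidden-layer MLP that returns class probabilities."""
    W1, b1, W2, b2 = params
    return softmax(dense(relu(dense(x, W1, b1)), W2, b2))


def random_params(in_dim, hidden, n_classes, rng):
    """Untrained parameters, used here only to make the sketch runnable."""
    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return mat(hidden, in_dim), [0.0] * hidden, mat(n_classes, hidden), [0.0] * n_classes


rng = random.Random(0)
# 11 classes as in UCF11; the other two sizes are hypothetical.
FEATURE_DIM, HIDDEN, N_CLASSES = 8, 16, 11

# Random stand-ins for the three stream outputs (assumption, not real features).
spatial = [rng.random() for _ in range(FEATURE_DIM)]
temporal = [rng.random() for _ in range(FEATURE_DIM)]
sequence = [rng.random() for _ in range(FEATURE_DIM)]

# Hypothetical stream weights; the abstract does not report the values used.
stream_weights = [0.4, 0.3, 0.3]

fused = weighted_sum_fusion([spatial, temporal, sequence], stream_weights)
probs = mlp_classify(fused, random_params(FEATURE_DIM, HIDDEN, N_CLASSES, rng))
predicted_class = max(range(N_CLASSES), key=probs.__getitem__)
```

Weighted summation requires the three streams to emit features of the same dimension, which is why the sketch checks lengths before fusing. With untrained random parameters the predicted class is meaningless; the sketch only shows the data flow from per-stream features to class probabilities.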

Authors: HUANG Ruifeng; CHEN Chong; CHENG Rui; WANG Xu; ZHANG Longfeng (School of Electronic and Information Engineering, Anhui Jianzhu University, Hefei 231600, China; Deep Learning and Computer Vision Lab of Anhui Jianzhu University, Hefei 231600, China; Anhui International Joint Research Center for Ancient Architecture Intellisencing and Multi-Dimensional Modeling, Hefei 231600, China)

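Affiliations: Hefei Yongxian Intelligent Technology Co., Ltd. (合肥涌现智能科技有限公司); Institute of Advanced Technology, University of Science and Technology of China; School of Electronic and Information Engineering, Anhui Jianzhu University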

Source: Journal of Chizhou University (《池州学院学报》), 2024, No. 3, pp. 21-27 (7 pages)

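Funding: National Natural Science Foundation of China (62001004); Anhui Provincial University Natural Science Research Project (KJ2019A0768); Anhui Jianzhu University Research Start-up Project for Introduced Talent (2020QDZ24).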

Keywords: Human action recognition; Three-stream network; Frame sequence feature; UCF11; UCF50; HMDB51

Classification code: O436 [Mechanical Engineering: Optical Engineering]
