Multimodal spatial-temporal feature representation and its application in action recognition
Cited by: 2
Abstract  In human action recognition, fusing depth data with skeleton data through a multimodal approach can effectively improve recognition rates. To address the large volume and high redundancy of depth image data, an algorithm that reduces redundancy by extracting the action frame sequence carrying the key temporal information, the centroid motion path relaxation algorithm, is proposed, and a new spatio-temporal feature representation is designed according to the characteristics of the different modalities. Based on the distance the centroid moves between adjacent frames, the centroid motion path relaxation algorithm computes a similarity coefficient for the active regions obtained by image differencing and then discards highly similar frames, retaining the key temporal information that is sufficient to express the action. The new spatio-temporal feature representation is built from the variation of the dynamic part of the image, the coordination of the body parts during motion, and local saliency features. The method is validated on the MSR-Action3D dataset. The average cross-validated recognition rate over the three subsets is 95.7432%, which is 2.4432%, 4.7632%, 0.3432%, and 0.2132% higher than the Multi-fused, CovP3DJ, D3D-LSTM (densely connected 3D CNN and long short-term memory), and Joint Subset Selection methods, respectively. In the extended experiment on the complete dataset, the cross-validated recognition rate is 93.0403%, showing good robustness. The experiments indicate that the proposed de-redundancy algorithm improves recognition after reducing redundancy, and that the extracted features have low mutual correlation and complement each other well when combined, effectively improving classification accuracy.

Objective: Human motion recognition has been developing within computer vision and pattern recognition, with applications such as assistive human-computer interaction, motion analysis, intelligent monitoring, and virtual reality. Conventional action recognition mainly uses RGB image sequences captured by an RGB camera, which provide only two-dimensional information. To improve the ability to detect short-duration segments, feature descriptors for RGB image sequences, such as the histogram of oriented gradients (HOG), the histogram of optical flow (HOF), and three-dimensional feature pyramids, are employed to characterize human behavior. Because RGB images describe behavior only in two dimensions, some researchers exploit the fact that depth images are insensitive to ambient light and combine depth information with RGB features to describe behavior. Multimodal methods that fuse depth data and skeleton data can effectively improve action recognition rates. Depth maps are now widely used in human behavior recognition, but the collected depth data need to be optimized because of the time complexity of feature extraction and the space complexity of feature storage. To resolve these problems, we develop an algorithm that optimizes the depth map frame sequence and reduces resource consumption, and we also propose a new representation of motion features based on the motion information of the centroid. Method: First, a temporal feature vector is extracted from the time-sequence information of the depth map sequence. The centroid motion path relaxation algorithm removes duplicate and redundant depth frames, and the temporal feature vector is concatenated with the spatial structure feature vector extracted from the skeleton to form the spatio-temporal feature input. Next, spatial features are extracted from a three-channel spatial feature map built by splicing the original skeleton point coordinates. Finally, the fused probabilities of the spatio-temporal features and the spatial features are used for classification and recognition. The centroid motion path relaxation algorithm targets redundant information, the time complexity of feature extraction, and the space complexity of feature storage. For the skeleton data, a global motion-direction feature is proposed to fully reflect the integrity and coordination of limb movements. The extracted features are concatenated to obtain the spatio-temporal feature vector and are fused and enhanced with the three-channel spatial feature map built from the original skeleton point coordinates. The effectiveness of the method is verified on the MSR-Action3D dataset. Result: In experimental setting 1, the proposed method is 0.8260% higher than the depth motion map-local binary pattern (DMM-LBP) algorithm, 1.0152% higher than DMM-CRC (collaborative representation classifier), 3.4501% higher than the DMM-gradient local auto-correlation (DMM-GLAC) algorithm, 0.6058% higher than the EigenJoint algorithm, and 10.6245% higher than the space-time auto-correlation of gradients (STACOG) algorithm. After redundancy removal, the result in experimental setting 1 is a further 0.1261% higher. Cross-validation in experimental setting 2 shows that the average classification recognition rate over the three subsets is 95.7432%, which is 2.4432% higher than the Multi-fused method, 4.7632% higher than CovP3DJ, 0.3432% higher than D3D-LSTM, and 0.2132% higher than Joint Subset Selection. On the full dataset, the method is 2.0303% higher than the low-latency method, 0.2403% higher than the combination-of-deep-models method, and 2.3403% higher than the complex network coding method. Overall, the average cross-validated recognition rate over the three subsets in setting 2 is 95.7432%, and the recognition rate on the complete dataset is 93.0403%. Conclusion: The proposed algorithm improves recognition after redundancy reduction, and the extracted features have low mutual correlation, which effectively improves classification accuracy.
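The abstract describes the frame de-redundancy step only at a high level, so a minimal sketch of the idea is given below. It is an illustrative reconstruction, not the authors' released code: it assumes depth frames supplied as NumPy arrays, stands in an intersection-over-union measure for the paper's similarity coefficient of the differenced active regions, and all thresholds, function names, and parameters are assumptions made for the example.

```python
# Sketch of a centroid-motion-based frame de-redundancy step, in the spirit of
# the "centroid motion path relaxation" described in the abstract.
# Assumptions: depth frames are 2-D NumPy arrays; the similarity coefficient is
# approximated by IoU of the differenced active regions; thresholds are made up.
import numpy as np


def active_region(prev_frame: np.ndarray, frame: np.ndarray,
                  diff_thresh: float = 10.0) -> np.ndarray:
    """Binary mask of the 'active' part obtained by differencing adjacent frames."""
    diff = np.abs(frame.astype(np.float32) - prev_frame.astype(np.float32))
    return diff > diff_thresh


def centroid(mask: np.ndarray) -> np.ndarray:
    """Centroid (row, col) of the active region; image centre if the mask is empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return np.array(mask.shape, dtype=np.float32) / 2.0
    return np.array([ys.mean(), xs.mean()], dtype=np.float32)


def similarity(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Similarity coefficient of two active regions (here: intersection over union)."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(mask_a, mask_b).sum()) / float(union)


def relax_frames(depth_seq, motion_thresh: float = 2.0,
                 sim_thresh: float = 0.9) -> list:
    """Indices of key frames: a frame is dropped when the centroid of its active
    region has barely moved and the region is highly similar to that of the last
    kept frame, so only the key temporal information is retained."""
    keep = [0]
    prev_mask = active_region(depth_seq[0], depth_seq[1])
    prev_cent = centroid(prev_mask)
    for t in range(1, len(depth_seq) - 1):
        mask = active_region(depth_seq[t], depth_seq[t + 1])
        cent = centroid(mask)
        moved = float(np.linalg.norm(cent - prev_cent))
        if moved < motion_thresh and similarity(prev_mask, mask) > sim_thresh:
            continue  # redundant: little centroid motion, near-identical active part
        keep.append(t)
        prev_mask, prev_cent = mask, cent
    return keep
```

Under these assumptions the retained indices would feed the temporal feature extraction stage; the exact similarity measure, distance criterion, and thresholds used in the paper may differ from this sketch.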
Authors: Shi Haiyong; Hou Zhenjie; Chao Xin; Zhong Zhuokun (School of Computer and Artificial Intelligence, Changzhou University, Changzhou 213164, China)
Source: Journal of Image and Graphics (中国图象图形学报), indexed in CSCD and the Peking University Core Journal list, 2023, No. 4, pp. 1041-1055 (15 pages)
Funding: National Natural Science Foundation of China (61063021); Postgraduate Research Innovation Program of Jiangsu Province (KYCX21_2835)
Keywords: action recognition; centroid motion; key temporal information; spatio-temporal feature representation; multimodal fusion
