Temporal dynamic frame selection and spatio-temporal graph convolution for interpretable skeleton-based action recognition
Abstract: Skeleton-based action recognition is a prominent research topic in computer vision and machine learning. Existing data-driven neural networks often overlook temporal dynamic frame selection in skeleton sequences and lack human-understandable decision logic, resulting in insufficient interpretability. To this end, an interpretable skeleton-based action recognition method based on temporal dynamic frame selection and spatio-temporal graph convolution is proposed to improve both interpretability and recognition performance. First, frame quality is estimated with a joint-confidence evaluation function, and low-quality skeleton frames are removed to address noise in the skeleton sequence. Second, drawing on domain knowledge of human motion, an adaptive temporal dynamic frame selection module is proposed to compute motion-salient regions and capture the dynamic patterns of key skeleton frames. Finally, to learn the intrinsic topology of the skeleton joints, an improved spatio-temporal graph convolutional network performs the interpretable recognition. Experiments on three large public datasets, NTU RGB+D, NTU RGB+D 120, and FineGym, show that the proposed method outperforms the compared methods in recognition accuracy while providing interpretability.
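The abstract only outlines the pipeline, so the following is a minimal, hypothetical sketch of its first two steps: confidence-based frame filtering and motion-salient frame selection. The function names, the 0.5 confidence threshold, and the use of inter-frame joint displacement as a saliency proxy are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def filter_low_quality_frames(skeleton, confidence, conf_threshold=0.5):
    """Drop frames whose mean joint confidence falls below a threshold.

    skeleton:   (T, J, C) array -- T frames, J joints, C coordinates.
    confidence: (T, J) per-joint detector confidences in [0, 1].
    The 0.5 threshold is illustrative, not the paper's value.
    """
    keep = confidence.mean(axis=1) >= conf_threshold   # (T,) boolean mask
    return skeleton[keep], confidence[keep]

def select_salient_frames(skeleton, num_keep=64):
    """Keep the frames with the highest motion energy, in temporal order.

    Saliency is approximated here by inter-frame joint displacement;
    the paper's adaptive module is not specified in the abstract, so
    this serves only as a plausible stand-in.
    """
    disp = np.linalg.norm(np.diff(skeleton, axis=0), axis=-1)  # (T-1, J)
    energy = np.concatenate([[0.0], disp.sum(axis=1)])         # (T,)
    idx = np.sort(np.argsort(energy)[-num_keep:])              # top frames, kept in order
    return skeleton[idx]

# Toy usage: a random 300-frame, 25-joint sequence (NTU RGB+D skeletons have 25 joints)
frames = np.random.randn(300, 25, 3)
conf = np.random.rand(300, 25)
clean, conf_clean = filter_low_quality_frames(frames, conf)
selected = select_salient_frames(clean, num_keep=64)
print(selected.shape)  # (64, 25, 3), provided at least 64 frames survive filtering
```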
Authors: LIANG Chengwu; YANG Jie; HU Wei; JIANG Songqi; QIAN Qiyang; HOU Ning (College of Electrical Engineering and New Energy, China Three Gorges University, Yichang, Hubei 443002, China; School of Electrical and Control Engineering, Henan University of Urban Construction, Pingdingshan, Henan 467036, China)
Source: Journal of Graphics (CSCD; Peking University Core Journal), 2024, No. 4, pp. 791-803 (13 pages)
Funding: National Natural Science Foundation of China (62176086, U1804152); Henan Provincial Science and Technology Research Project (242102211055).
Keywords: action recognition; skeleton sequence; interpretability; motion salient regions; spatio-temporal graph convolutional network
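For the spatio-temporal graph convolution component named in the abstract and keywords, a minimal spatial graph-convolution layer over the joint graph might look as follows. This is a generic ST-GCN-style unit, not the paper's improved network; the chain adjacency in the usage example is a hypothetical stand-in for the real 25-joint body topology.

```python
import torch
import torch.nn as nn

class SpatialGraphConv(nn.Module):
    """One spatial graph-convolution layer over skeleton joints.

    `A` is a fixed (J, J) adjacency matrix encoding bone connections.
    Features are diffused along the normalized graph, then mixed
    channel-wise by a 1x1 convolution, as in a standard ST-GCN spatial
    unit; the paper's interpretability-oriented changes are not shown.
    """
    def __init__(self, in_channels, out_channels, A):
        super().__init__()
        A_hat = A + torch.eye(A.size(0))          # add self-loops
        d = A_hat.sum(dim=1).pow(-0.5)            # D^{-1/2}
        self.register_buffer("A_norm", d[:, None] * A_hat * d[None, :])
        self.mix = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        # x: (N, C, T, J) -- batch, channels, frames, joints
        x = torch.einsum("nctj,jk->nctk", x, self.A_norm)
        return self.mix(x)

# Toy usage with a hypothetical chain topology over J = 25 joints
J = 25
A = torch.zeros(J, J)
for i in range(J - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0
layer = SpatialGraphConv(in_channels=3, out_channels=64, A=A)
out = layer(torch.randn(8, 3, 64, J))   # batch of 8 clips, 64 frames each
print(out.shape)                        # torch.Size([8, 64, 64, 25])
```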
