A 3D human pose estimation approach based on spatio-temporal motion interaction modeling
Abstract: 3D human pose estimation plays an important role in fields such as virtual reality and human-computer interaction. In recent years, the Transformer has been introduced into 3D human pose estimation to capture the spatio-temporal motion information of human joints. However, existing studies typically either focus on the overall motion of the joint set or model the motion of each joint in isolation; neither explores the distinct motion pattern of each joint or how the motions of different joints influence one another. This paper therefore proposes a method that learns the spatial information of the 2D human joints in each frame and analyzes the specific motion pattern of each joint. A motion information interaction module based on the Transformer encoder captures the dynamic motion relationships between different joints. Compared with existing models that directly learn the overall motion of human joints, the proposed method improves prediction accuracy by approximately 3%. Compared with the state-of-the-art MixSTE model, which focuses on single-joint motion, the proposed model captures the spatio-temporal features of joints more efficiently and improves inference speed by more than 20%, making it better suited to real-time inference scenarios.
Authors: LV Heng (吕衡); YANG Hongyu (杨鸿宇) (School of Computer Science and Engineering, Beihang University, Beijing 100191, China; Institute of Artificial Intelligence, Beihang University, Beijing 100191, China)
Source: Journal of Graphics (《图学学报》), indexed in CSCD and the Peking University Core Journals list, 2024, No. 1, pp. 159-168 (10 pages)
Funding: Beijing Natural Science Foundation (4222049); National Natural Science Foundation of China (62202031).
Keywords: 3D human pose estimation; Transformer encoder; inter-joint motion; temporal-spatial information correlation; real-time inference
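
The abstract names three ingredients: per-frame spatial learning over 2D joints, per-joint motion modeling over time, and an interaction step between the motions of different joints. The toy sketch below only illustrates how those three attention patterns differ in what they attend over. It is not the authors' architecture. The function names, the identity query/key/value projections, the mean-over-frames "motion token", and the residual add are all assumptions made for illustration, and the final regression from features to 3D coordinates is omitted.

```python
import math

def self_attention(tokens):
    """Single-head scaled dot-product self-attention with identity projections.

    tokens: list of equal-length float vectors; returns a list of the same shape.
    Learned query/key/value weights are left out to keep the sketch short.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[d] for w, v in zip(weights, tokens))
                    for d in range(dim)])
    return out

def spatial_stage(seq):
    """Attend across joints within each frame; seq[t][j] is a feature vector."""
    return [self_attention(frame) for frame in seq]

def temporal_stage(seq):
    """Attend across frames, separately for each joint."""
    n_frames, n_joints = len(seq), len(seq[0])
    per_joint = [self_attention([seq[t][j] for t in range(n_frames)])
                 for j in range(n_joints)]
    return [[per_joint[j][t] for j in range(n_joints)] for t in range(n_frames)]

def interaction_stage(seq):
    """Let per-joint motion summaries attend to each other, then add them back."""
    n_frames, n_joints, dim = len(seq), len(seq[0]), len(seq[0][0])
    # Hypothetical motion token: the mean of a joint's features over all frames.
    motion = [[sum(seq[t][j][d] for t in range(n_frames)) / n_frames
               for d in range(dim)] for j in range(n_joints)]
    mixed = self_attention(motion)
    return [[[seq[t][j][d] + mixed[j][d] for d in range(dim)]
             for j in range(n_joints)] for t in range(n_frames)]

def encode(seq):
    """Chain the three stages: spatial, then temporal, then inter-joint."""
    return interaction_stage(temporal_stage(spatial_stage(seq)))

# Toy input: 3 frames, 4 joints, 2D coordinates used directly as features.
poses_2d = [[[0.1 * t + 0.2 * j, 0.3 * j - 0.05 * t] for j in range(4)]
            for t in range(3)]
features = encode(poses_2d)
print(len(features), len(features[0]), len(features[0][0]))  # prints: 3 4 2
```

In the spatial and interaction stages the attention runs over joints, while in the temporal stage it runs over frames. A real model would add learned projections, multiple heads, positional embeddings, and a head that maps the features to 3D joint positions.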