A 3D human pose estimation approach based on spatio-temporal motion interaction modeling

Abstract: 3D human pose estimation plays a crucial role in fields such as virtual reality and human-computer interaction. In recent years, the Transformer has been introduced into 3D human pose estimation to capture the spatiotemporal motion information of human joints. However, existing studies typically either focus on the collective motion of the full set of joints or model the motion of each joint in isolation; neither examines the distinct motion pattern of each joint together with the influence that the motions of different joints have on one another. This paper therefore proposes a method that learns the spatial information of the 2D human joints in each frame and analyzes the specific motion pattern of each joint. A motion information interaction module built on the Transformer encoder captures the dynamic motion relationships between different joints. Compared with existing models that directly learn the overall motion of human joints, the proposed method improves prediction accuracy by approximately 3%. Compared with the state-of-the-art MixSTE model, which focuses on single-joint motion, the proposed model captures the spatiotemporal features of joints more efficiently and achieves an inference speedup of more than 20%, making it better suited to real-time inference scenarios.
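The abstract names three stages: per-frame spatial learning over joints, per-joint motion modeling over time, and an inter-joint motion interaction module. The following is only a minimal sketch of that data flow, not the authors' implementation. It assumes single-head scaled dot-product self-attention with identity projections (no learned weights, no positional encoding, no 3D regression head), and it assumes that each joint's trajectory is flattened into one "motion token" before the interaction stage. All function names (`self_attention`, `spatial_stage`, `temporal_stage`, `interaction_stage`, `pipeline`) are hypothetical.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    # Single-head scaled dot-product self-attention with identity Q/K/V
    # projections (an assumption made to keep the sketch weight-free).
    # tokens: list of N vectors, each a list of d floats. Returns N vectors.
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(d)])
    return out

def spatial_stage(seq):
    # seq[t][j] is the feature vector of joint j in frame t.
    # Attend across the joints of each frame independently.
    return [self_attention(frame) for frame in seq]

def temporal_stage(seq):
    # Attend across frames separately for each joint, so every joint
    # gets its own motion representation.
    T, J = len(seq), len(seq[0])
    per_joint = [self_attention([seq[t][j] for t in range(T)]) for j in range(J)]
    return [[per_joint[j][t] for j in range(J)] for t in range(T)]

def interaction_stage(seq):
    # Assumption: flatten each joint's whole trajectory into one motion
    # token, then attend across joints so their motions can interact.
    T, J, d = len(seq), len(seq[0]), len(seq[0][0])
    motion_tokens = [[x for t in range(T) for x in seq[t][j]] for j in range(J)]
    mixed = self_attention(motion_tokens)
    return [[mixed[j][t * d:(t + 1) * d] for j in range(J)] for t in range(T)]

def pipeline(seq):
    # Spatial -> per-joint temporal -> inter-joint motion interaction.
    return interaction_stage(temporal_stage(spatial_stage(seq)))

if __name__ == "__main__":
    # Toy input: 3 frames, 4 joints, 2D keypoint coordinates.
    frames = [[[float(t + j), float(t * j)] for j in range(4)] for t in range(3)]
    features = pipeline(frames)
    print(len(features), len(features[0]), len(features[0][0]))
```

The point of the sketch is the axis along which attention runs at each stage: joints within a frame, frames within a joint, and then joints again at the level of whole trajectories. A real model would add learned projections, multiple heads, feed-forward layers, and a head that regresses 3D coordinates.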

Authors: LV Heng; YANG Hongyu (School of Computer Science and Engineering, Beihang University, Beijing 100191, China; Institute of Artificial Intelligence, Beihang University, Beijing 100191, China)

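Affiliations: School of Computer Science and Engineering, Beihang University; Institute of Artificial Intelligence, Beihang University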

Source: Journal of Graphics (图学学报), indexed in CSCD and the Peking University Core Journals list, 2024, No. 1, pp. 159-168 (10 pages)

Funding: Beijing Natural Science Foundation (4222049); National Natural Science Foundation of China (62202031).

Keywords: 3D human pose estimation; Transformer encoder; inter-joint motion; temporal-spatial information correlation; real-time inference

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
