Symmetric Residual Network for Human Motion Prediction (面向人体动作预测的对称残差网络)
Abstract: To study how different residual connection schemes affect convolutional neural networks (CNNs) for human motion prediction, this paper investigates how residual connections can be used, at a fixed network depth, to build a prediction model that captures human motion features efficiently. Based on the arrangement of human skeletal joints, a symmetric residual connection method is proposed for skeletal joint prediction, and a symmetric residual block (SRB) is designed on top of it. In the SRB, the receptive field of the last convolutional layer's kernel is maximized so that it covers the information of all joints of the human body, and the symmetric connections make efficient use of shallow dynamic features, which improves prediction performance while reducing the number of model parameters. An end-to-end convolutional network built from two SRBs and one decoder, the symmetric residual network (SRNet), is also proposed; it achieves higher prediction accuracy than the baseline methods. Human motion prediction experiments are carried out in TensorFlow on two public datasets, Human3.6M and CMU-Mocap. The results show that, compared with the baseline methods, the proposed method reduces the mean per joint position error (MPJPE) by 0.2 mm to 1 mm at every prediction time point, which confirms that SRNet effectively models the global spatial features of human poses.
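The MPJPE metric named in the abstract is commonly computed as the average Euclidean distance between predicted and ground-truth joint positions. The sketch below illustrates that standard definition in plain Python; the function name `mpjpe`, the two-joint pose, and its coordinates are hypothetical and are not taken from the paper, whose own evaluation code is not shown here.

```python
import math


def mpjpe(pred, target):
    """Mean per joint position error.

    pred, target: equal-length sequences of (x, y, z) joint positions.
    Returns the mean Euclidean distance, in the same unit as the inputs.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal length")
    total = 0.0
    for p, t in zip(pred, target):
        # Euclidean distance between one predicted joint and its ground truth
        total += math.dist(p, t)
    return total / len(pred)


# Hypothetical two-joint pose, coordinates in millimetres (illustrative only)
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
target = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(mpjpe(pred, target))  # 2.5
```

With coordinates in millimetres, a reduction of 0.2 mm to 1 mm as reported in the abstract corresponds to this average distance shrinking by that amount at a given prediction time point.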
Authors: ZHANG Jin (张晋), TANG Jin (唐进), YIN Jianqin (尹建芹) (School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China)
Source: Robot (《机器人》), 2022, No. 3, pp. 291-298 (8 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61673192); Fundamental Research Funds for the Central Universities (2020XD-A04-2).
Keywords: human motion prediction; symmetric residual connection; convolutional neural network; skeletal joints modeling