
Part-Level Occlusion-Aware Human Pose Estimation (cited by: 7)
Abstract: Driven by the rapid development of deep learning, human pose estimation has made remarkable progress in recent years, but existing methods still struggle with occlusion, which is pervasive in real images. To address this problem, this paper proposes a part-level occlusion-aware human pose estimation method. First, a baseline human pose estimation network extracts a noisy feature representation of each body part from images that contain occlusion noise. Then, an occluded-part prediction module estimates which body parts are occluded and produces a visibility vector. The module consists of two submodules: an occluded-part classification network, which predicts the occlusion state of each keypoint, and a visibility encoder, which uses a channel attention mechanism to convert the predicted occlusion states into a set of weights. Finally, the visibility vector and the noisy features are fused by channel re-weighting to obtain part-level occlusion-aware features, which are used to compute the keypoint heatmaps. Experimental results on the MPII and LSP (Leeds Sports Pose) datasets show that, compared with the baseline pose estimation network, the proposed method handles occlusion better at a small additional computational cost and achieves better results than existing state-of-the-art methods.
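The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single linear layer with a sigmoid gate standing in for the visibility encoder, the convention that a positive logit means "occluded", and all numeric values are assumptions chosen only to show how per-keypoint occlusion predictions could be turned into per-channel weights and applied to noisy features.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def visibility_vector(occlusion_logits, weight, bias):
    """Map K per-keypoint occlusion logits to C per-channel weights.

    Hypothetical stand-in for the visibility encoder: one linear layer
    followed by a sigmoid gate, in the style of channel attention.
    """
    # Assumed convention: a positive logit means the keypoint is occluded.
    visible = [1.0 - sigmoid(z) for z in occlusion_logits]
    return [
        sigmoid(sum(w * v for w, v in zip(row, visible)) + b)
        for row, b in zip(weight, bias)
    ]


def reweight_channels(features, weights):
    """Scale each channel map (H x W nested lists) by its scalar weight."""
    return [
        [[value * w for value in row] for row in channel]
        for channel, w in zip(features, weights)
    ]


# Toy example: C = 2 channels, K = 2 keypoints, 1 x 2 feature maps.
features = [[[1.0, 2.0]], [[3.0, 4.0]]]
logits = [-4.0, 4.0]                 # keypoint 0 likely visible, keypoint 1 likely occluded
weight = [[6.0, 0.0], [0.0, 6.0]]    # illustrative values, not learned
bias = [-3.0, -3.0]

gates = visibility_vector(logits, weight, bias)
fused = reweight_channels(features, gates)
```

In this toy setup each channel is tied to one keypoint, so the channel associated with the likely occluded keypoint is scaled down while the other is mostly preserved. In a trained model the mapping from occlusion states to channel weights would be learned rather than fixed.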
Authors: Chu Zhen; Mi Qing; Ma Wei; Xu Shibiao; Zhang Xiaopeng (Faculty of Information Technology, Beijing University of Technology, Beijing 100124; Artificial Intelligence School, Beijing University of Posts and Telecommunications, Beijing 100876; National Laboratory of Pattern Recognition (Institute of Automation, Chinese Academy of Sciences), Beijing 100190)
Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2022, No. 12, pp. 2760-2769 (10 pages)
Funding: National Natural Science Foundation of China (61771026, 62176010); Key Project of the Beijing Municipal Education Commission (KZ201910005008).
Keywords: human pose estimation; human keypoint detection; occlusion inference; channel attention mechanism; multi-task learning
