Real-Time Multi-Person Video-Based Pose Estimation  Cited by: 14
Abstract Multi-person pose estimation in images and videos suffers from inaccurate localization of human bounding boxes and insufficient detection accuracy for hard keypoints. To address these problems, this paper designs a real-time multi-person pose estimation model based on a top-down framework. First, depthwise separable convolution is added to the object detection algorithm to speed up the human detector. Next, a feature pyramid network is combined with contextual semantic information, and an online hard example mining algorithm is used to improve the low detection accuracy of hard keypoints. Finally, a spatial transformer network is combined with a pose similarity calculation to eliminate redundant poses and improve the accuracy of bounding-box localization. On the 2017 MS COCO Test-dev dataset, the average precision of the proposed model is 14.84% higher than that of the Mask R-CNN model and 2.43% higher than that of the RMPE model, and the frame rate reaches 22 frames/s.
Authors Yan Fenting; Wang Peng; Lü Zhigang; Ding Zhe; Qiao Mengyu (School of Electronics and Information Engineering, Xi'an Technological University, Xi'an, Shaanxi 710021, China)
Source Laser & Optoelectronics Progress (《激光与光电子学进展》), CSCD, Peking University Core Journal, 2020, No. 2, pp. 89-96 (8 pages)
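Funding National Natural Science Foundation of China (61671362); Key Research and Development Program of the Shaanxi Provincial Department of Science and Technology (2019GY-022).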
Keywords image processing; multi-person pose estimation; spatial transformer network; semantic information; pose distance
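
The abstract says that online hard example mining is used to raise accuracy on hard keypoints but does not give the loss. The sketch below illustrates one common way such mining is done for keypoints: compute a per-keypoint heatmap loss, then back-propagate only the `top_k` largest. The top-k selection rule, the mean-squared-error loss, and the names `per_keypoint_mse` and `hard_keypoint_loss` are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def per_keypoint_mse(pred: Sequence[Sequence[float]],
                     target: Sequence[Sequence[float]]) -> List[float]:
    """Mean squared error per keypoint; each heatmap is given as a flat list."""
    losses = []
    for p_map, t_map in zip(pred, target):
        squared = [(p - t) ** 2 for p, t in zip(p_map, t_map)]
        losses.append(sum(squared) / len(squared))
    return losses


def hard_keypoint_loss(pred: Sequence[Sequence[float]],
                       target: Sequence[Sequence[float]],
                       top_k: int) -> float:
    """Average the loss over only the top_k keypoints with the largest error.

    Assumption: hard keypoints are the ones with the highest current loss.
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    losses = per_keypoint_mse(pred, target)
    hardest = sorted(losses, reverse=True)[:top_k]
    return sum(hardest) / len(hardest)


# Toy data: 3 keypoints, each heatmap flattened to 4 values.
pred = [[0.0, 0.0, 0.0, 0.0],
        [1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]
target = [[1.0, 0.0, 0.0, 0.0],
          [1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 1.0, 0.0]]

# Only the two worst-predicted keypoints contribute to the loss.
loss = hard_keypoint_loss(pred, target, top_k=2)
print(loss)
```

In this toy case the second keypoint is predicted perfectly, so it is excluded and the loss is driven entirely by the two keypoints the model gets wrong. In a real training loop the same selection would be applied to tensors inside the loss function.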