Abstract
Human pose estimation struggles to predict correct poses when faced with scale variation in feature maps. To address this, this paper proposes MSANet (multiscale-attention net), a high-resolution network based on a multi-scale attention mechanism, to improve the detection accuracy of human pose estimation. The network introduces lightweight pyramid convolution and attentional feature fusion to extract multi-scale information more efficiently, and applies a self-transformer module in the fusion of the parallel subnets for feature enhancement, capturing global features. In the output stage, the features of each layer are fused with an adaptive spatial feature fusion strategy to form the final output, more fully exploiting the semantic information of high-level features and the fine-grained detail of low-level features in order to infer invisible and occluded keypoints. Experimental results on the public COCO2017 dataset show that the method improves estimation accuracy by 4.2% over the baseline network HRNet.
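The output-stage fusion described above weights each layer's feature map per pixel before summing, so that every spatial location can draw on whichever resolution is most informative. Below is a minimal NumPy sketch of that per-pixel weighted-fusion idea; the function name, the externally supplied weight logits, and the array shapes are illustrative assumptions (in the actual network the weights would come from small learned convolutions), not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_spatial_feature_fusion(features, weight_logits):
    """Fuse same-resolution feature maps with per-pixel softmax weights.

    features:      list of L arrays, each of shape (C, H, W), assumed already
                   resized to a common resolution.
    weight_logits: array of shape (L, H, W); hypothetical stand-in for the
                   learned per-level weight maps.
    """
    # Per pixel, the L weights are normalized to sum to 1.
    alphas = softmax(weight_logits, axis=0)            # (L, H, W)
    # Weighted sum across levels; broadcasting applies each (H, W) weight
    # map to all C channels of its level.
    fused = sum(a[None] * f for a, f in zip(alphas, features))
    return fused, alphas
```

With equal logits this reduces to a plain average of the levels; the learned weights let the network instead emphasize high-level semantics or low-level detail at each location.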
Authors
Li Li; Zhang Rongfen; Liu Yuhong; Chen Na; Zhang Wenwen (College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China)
Source
Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journal list
2022, No. 11, pp. 3487-3491, 3497 (6 pages)
Funding
Supported by the Science and Technology Foundation of Guizhou Province (黔科合基础-ZK[2021]重点001).