Lightweight and Efficient Human Pose Estimation Fusing Transformer and Attention
Abstract: To address the heavy computational cost and large network size of human pose estimation algorithms, a lightweight efficient vision transformer for human posture estimation (LEViTPose) is proposed. First, a lightweight preprocessing module, LStem, is designed using depthwise separable convolution, channel shuffle, and parallel multi-scale convolution kernels. Next, a cascaded group spatial linear reduction attention (CGSLRA) is proposed: it splits features into groups assigned to separate attention heads to improve memory efficiency, and reduces feature dimensionality within each group to cut computational redundancy. Finally, a lightweight feature recovery module (LFRM) is designed using pointwise convolution and grouped transposed convolution. Experimental results show that, compared with the baseline model, the proposed method improves network performance and inference speed while reducing network size and computational overhead. Compared with LiteHRNet-30 on the MPII and COCO validation sets, average accuracy improves by 2.6 and 3.4 percentage points, respectively, and inference speed doubles.
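The abstract names three standard building blocks: depthwise separable convolution, channel shuffle, and grouped transposed convolution. The sketch below is not the authors' implementation. It only illustrates, with textbook weight-count formulas (biases ignored), why these operations shrink a network, and how channel shuffle reorders channels. All function names and the example layer sizes are hypothetical and chosen for illustration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv (one filter per input channel)
    followed by a 1 x 1 pointwise conv (bias ignored)."""
    return c_in * k * k + c_in * c_out


def grouped_transposed_conv_params(c_in, c_out, k, groups):
    """Weights in a grouped k x k transposed convolution (bias ignored).
    Each group maps c_in/groups input channels to c_out/groups outputs."""
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must be divisible by groups")
    return c_in * (c_out // groups) * k * k


def channel_shuffle(channels, groups):
    """Interleave channels across groups: view the list as
    (groups, per_group), transpose, then flatten."""
    n = len(channels)
    if n % groups:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    print(standard_conv_params(32, 64, 3))                # 18432
    print(depthwise_separable_params(32, 64, 3))          # 2336
    print(grouped_transposed_conv_params(64, 64, 4, 1))   # 65536
    print(grouped_transposed_conv_params(64, 64, 4, 64))  # 1024
    print(channel_shuffle(list(range(6)), 2))             # [0, 3, 1, 4, 2, 5]
```

For the assumed 32-to-64-channel 3 x 3 layer, the separable form needs roughly one eighth of the weights. Channel shuffle adds no weights at all: it only permutes channels so that information mixes across groups in the next grouped operation.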
Authors: WU Chengpeng; TAN Guangxing; CHEN Haifeng; LI Chunyu (College of Automation, Guangxi University of Science and Technology, Liuzhou, Guangxi 545616, China)
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2024, No. 22, pp. 197-208 (12 pages)
Funding: National Natural Science Foundation of China (61563005).
Keywords: human pose estimation; lightweight network; attention mechanism; Transformer