Abstract
Object 6D pose estimation is easily degraded by weakly textured and small target objects, complex backgrounds, and occlusion. To address these problems, an object 6D pose estimation algorithm combining feature fusion and attention mechanisms is proposed. First, a Convolutional Block Attention Module is added to the first convolutional block of the RGB image feature extraction network to increase the regional saliency of small, weakly textured objects. Second, skip connections based on the Convolutional Block Attention Module are introduced into the encoder-decoder RGB image feature extraction network, fusing detailed appearance features such as color and texture from the encoding stage into the pose semantic features of the decoding stage and compensating for the lack of appearance detail in those features. Third, a Channel Attention Module is used to improve the Pyramid Pooling Module, strengthening the connection between the visible and occluded regions of the target object and improving robustness to occlusion. Finally, the Convolutional Block Attention Module is used to reconstruct the pose semantic features output by the decoding stage, making similar surface features easier to distinguish and thereby reducing interference from similar-looking objects. Experimental results show that the algorithm achieves ADD(-S) scores of 73.4% on the Occlusion LINEMOD dataset and 99.8% on the LINEMOD dataset, improvements of 7.8 and 0.1 percentage points over FFB6D, respectively, which verifies the feasibility of the algorithm.
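The abstract reports results with the ADD(-S) metric but does not define it. The sketch below shows how the two underlying distances are commonly computed, and is not taken from the paper. ADD averages the distance between corresponding model points under the ground-truth and predicted poses. ADD-S, used for symmetric objects, averages the distance from each ground-truth point to its nearest predicted point. The function names, the toy point set, and the acceptance threshold of 10% of the object diameter are assumptions made for illustration.

```python
import math


def transform(points, R, t):
    """Apply rotation R (3x3 nested lists) and translation t (length 3) to each point."""
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def add_distance(points, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding points under the two poses."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(math.dist(a, b) for a, b in zip(gt, pred)) / len(points)


def adds_distance(points, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: mean distance from each ground-truth point to its nearest predicted point."""
    gt = transform(points, R_gt, t_gt)
    pred = transform(points, R_pred, t_pred)
    return sum(min(math.dist(a, b) for b in pred) for a in gt) / len(points)


def diameter(points):
    """Largest pairwise distance between model points."""
    return max(math.dist(a, b) for a in points for b in points)


def is_correct(distance, diam, ratio=0.1):
    """Assumed acceptance rule: distance below a fraction of the object diameter."""
    return distance < ratio * diam


# Hypothetical toy model: four points that are symmetric under a 180-degree turn about z.
POINTS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0)]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
HALF_TURN_Z = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]
ZERO = [0, 0, 0]

if __name__ == "__main__":
    d_add = add_distance(POINTS, IDENTITY, ZERO, HALF_TURN_Z, ZERO)
    d_adds = adds_distance(POINTS, IDENTITY, ZERO, HALF_TURN_Z, ZERO)
    diam = diameter(POINTS)
    print("ADD:", d_add, "accepted:", is_correct(d_add, diam))
    print("ADD-S:", d_adds, "accepted:", is_correct(d_adds, diam))
```

In this toy case the predicted pose differs from the ground truth by a half turn that maps the point set onto itself, so ADD penalizes the prediction while ADD-S does not. This is why the combined ADD(-S) score applies ADD-S to symmetric objects and ADD to the rest.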
Authors
高维东
林琳
刘贤梅
赵娅
GAO Wei-dong; LIN Lin; LIU Xian-mei; ZHAO Ya (School of Computer and Information Technology, Northeast Petroleum University, Daqing 163318, China)
Source
《计算机技术与发展》
2023, No. 12, pp. 92-100 (9 pages)
Computer Technology and Development
Funding
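Key Project of the Heilongjiang Province Education Science "14th Five-Year Plan" (GJB1421114)
Natural Science Foundation of Heilongjiang Province (LH2020F003)
Key Commissioned Project of Heilongjiang Province Higher Education Teaching Reform (SJGZ20200037)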