Abstract
To address the difficulty that current robots have in achieving accurate 6DoF pose estimation when objects are occluded or lighting is insufficient, this paper proposes a neural network framework based on pixel-level feature fusion. The framework consists of three modules: an RGB feature extraction network, a pixel-level fusion structure, and a 6D pose regression network. The RGB feature extraction network first segments the target objects and then extracts their features; the pixel-level fusion structure fuses the RGB features with 3D multi-view features; the last module fuses the 3D point-cloud pixels and outputs the 6D pose of the object. Experiments on the YCB-Video dataset, the LINEMOD dataset, and the YCB-Occlusion dataset processed in this paper show that the proposed pixel-level fusion network can effectively predict the 6D pose of objects even when they are occluded or their point-cloud data are partially lost. Moreover, compared with other networks, the framework is more robust, and its computational efficiency is improved by hundreds of times at the cost of only a small loss of accuracy.
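The abstract only outlines the three modules; the paper itself carries the implementation details. As a rough illustration of what per-pixel fusion of RGB and point-cloud features can look like, below is a minimal PyTorch sketch in the spirit of DenseFusion-style architectures. All class names, feature dimensions, and the global max-pooling design are assumptions for illustration, not the authors' actual network.

```python
import torch
import torch.nn as nn

class PixelLevelFusion(nn.Module):
    """Hypothetical per-pixel fusion of colour and geometry embeddings."""

    def __init__(self, d_rgb=32, d_geo=32, d_global=512):
        super().__init__()
        # 1x1 convolutions act as shared per-pixel MLPs
        self.local = nn.Conv1d(d_rgb + d_geo, 128, 1)
        self.global_mlp = nn.Conv1d(128, d_global, 1)

    def forward(self, rgb_feat, geo_feat):
        # rgb_feat: (B, d_rgb, N) colour embedding of N foreground pixels
        # geo_feat: (B, d_geo, N) point-cloud embedding of the same pixels
        x = torch.cat([rgb_feat, geo_feat], dim=1)           # per-pixel concat
        local = torch.relu(self.local(x))                    # (B, 128, N)
        g = self.global_mlp(local).max(dim=2, keepdim=True).values
        g = g.expand(-1, -1, x.size(2))                      # broadcast global feature
        return torch.cat([local, g], dim=1)                  # (B, 128 + d_global, N)

# Usage sketch: 500 sampled foreground pixels per image
fusion = PixelLevelFusion()
rgb = torch.randn(2, 32, 500)
geo = torch.randn(2, 32, 500)
fused = fusion(rgb, geo)  # torch.Size([2, 640, 500])
```

The dense per-pixel output is what would feed a pose regression head, so a prediction remains possible from the visible pixels alone even when part of the object's point cloud is missing.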
Authors
LIANG Dayong, CHEN Junhong, ZHU Zhanmo, HUANG Kesi, LIU Wenyin
(School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China)
Source
Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》)
Indexed in: CSCD; Peking University Core Journals
2020, No. 12, pp. 2072-2082 (11 pages)
Funding
National Natural Science Foundation of China, Nos. 91748107, 61703109
Guangdong Introduction of Innovative R&D Team Program, No. 2014ZT05G157
Special Fund for Science and Technology Innovation Strategy of Guangdong Province, No. pdjh2020a0173
Keywords
pixel-level fusion
convolutional neural networks (CNN)
point cloud feature fusion
6DoF pose estimation