Abstract
Obtaining the 6D pose of a target object from a single RGB image has broad applications in robotic manipulation, virtual reality, and related fields. However, deep-learning-based pose estimation methods usually require large training datasets to achieve good generalization, and common data collection methods suffer from high collection cost and a lack of 3D spatial position information. To address this, a 6D object pose estimation network trained on low-quality rendered images is proposed. In this network, the feature extraction part takes a single RGB image as input and uses a residual network to extract image features; in the pose estimation part, a classification stream predicts the category of the target object, while a regression stream regresses the object's rotation angle and translation vector in 3D space. In addition, a domain randomization method is used to build, at low collection cost, Pose6DDR, a large-scale dataset of low-quality rendered images annotated with objects' 3D spatial position information. Experimental results on the established Pose6DDR dataset and the public LineMod dataset demonstrate the superiority of the proposed pose estimation method and the effectiveness of the domain-randomized large-scale data generation approach.
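To make the described architecture concrete, the following is a minimal sketch, not the authors' released code, of a residual backbone feeding a classification stream and a pose-regression stream. The class name `TwoStreamPoseNet`, the ResNet-34 backbone, the quaternion parameterization of rotation, and `num_classes=13` are illustrative assumptions that go beyond what the abstract specifies.

```python
# Minimal sketch (assumed PyTorch/torchvision implementation, not the paper's code):
# ResNet feature extractor + classification stream + pose-regression stream.
import torch
import torch.nn as nn
import torchvision.models as models


class TwoStreamPoseNet(nn.Module):
    def __init__(self, num_classes: int = 13):  # num_classes is a placeholder
        super().__init__()
        # Feature extraction: ResNet backbone without its final fc layer.
        backbone = models.resnet34(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        feat_dim = backbone.fc.in_features  # 512 for resnet34

        # Classification stream: predicts the target object's category.
        self.cls_stream = nn.Linear(feat_dim, num_classes)

        # Regression stream: rotation (here a 4-d quaternion, an assumption;
        # the paper regresses a rotation angle) plus a 3-d translation vector.
        self.pose_stream = nn.Sequential(
            nn.Linear(feat_dim, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, 7),
        )

    def forward(self, rgb: torch.Tensor):
        f = self.features(rgb).flatten(1)                    # (B, feat_dim)
        logits = self.cls_stream(f)                          # (B, num_classes)
        pose = self.pose_stream(f)                           # (B, 7)
        quat = nn.functional.normalize(pose[:, :4], dim=1)   # unit rotation
        trans = pose[:, 4:]                                  # translation
        return logits, quat, trans


if __name__ == "__main__":
    net = TwoStreamPoseNet(num_classes=13)
    logits, quat, trans = net(torch.randn(2, 3, 224, 224))
    print(logits.shape, quat.shape, trans.shape)  # (2, 13) (2, 4) (2, 3)
```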
Authors
ZUO Guo-yi
ZHANG Cheng-wei
LIU Hong-xing
GONG Dao-xiong
(Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computing Intelligence and Intelligent Systems, Beijing 100124, China)
Source
《控制与决策》
EI
CSCD
Peking University Core Journals (北大核心)
2022, No. 1, pp. 135-141 (7 pages)
Control and Decision
Funding
National Key R&D Program of China (2018YFB1307004)
National Natural Science Foundation of China (61873008)
Beijing Natural Science Foundation (4182008, 4192010)