Small object grasping detection in cluttered scenes (杂乱场景下小物体抓取检测研究)


Abstract

Objective: Object grasp pose detection in cluttered scenes is an essential skill for intelligent robots. Despite recent advances in six degrees-of-freedom (6-DoF) grasp learning, learning grasp configurations for small objects remains extremely challenging. First, because raw point clouds are very large, the points in a scene must be downsampled to reduce the computational complexity of the network and increase detection efficiency. However, previous sampling methods place fewer points on small objects, which makes grasp poses on small objects difficult to learn. Second, the consumer-grade depth cameras currently on the market are very noisy, and the quality of the point clouds captured on small objects in particular cannot be guaranteed, so the network may predict unclear objectness for points on small objects. Some feasible grasp points are then mistaken for background points, which further reduces the number of sampling points on small objects and results in weak grasping performance on them.

Method: A potential problem in previous grasp detection methods is that they ignore the biased distribution of sampling points caused by differences in object scale within a scene, which leaves fewer sampling points on small objects. In this study, we propose an object mask-assisted sampling method that samples the same number of points on every object to balance the grasp distribution, solving the problem of unevenly distributed sampling points. At inference time, where no a priori point-level masks of the scene are available, we introduce an unseen object instance segmentation network to distinguish the objects in the scene, which makes mask-assisted sampling possible. In addition, a multi-scale learning strategy is adopted: multi-scale cylindrical grouping is applied to the partial point clouds of objects to improve local geometric representation, addressing the difficulty of learning grasp operation parameters caused by differences in object scale. Specifically, we set up three cylinders with different radii to sample the point cloud near each graspable point, corresponding to features of large, medium, and small objects, and then concatenate the features of the three scales. The concatenated features are processed by a self-attention layer to strengthen attention to the local region and improve the local geometric representation of the object. Similar to GraspNet, we design an end-to-end grasping network that consists of three parts: graspable point prediction, approach direction prediction, and gripper operation prediction. Graspable points are the high-scoring points in the scene that are suitable for grasping. They provide an initial filtering of the large amount of point cloud data in the scene and are then passed to the proposed sampling and learning methods, which predict the approach direction and gripper operation for grasp poses on an object. Embedding the proposed sampling and learning methods in this end-to-end network effectively improves object grasp detection capability.

Result: Evaluated on the large-scale benchmark dataset GraspNet-1Billion, the proposed method achieves state-of-the-art performance among the compared methods, with the grasping metrics on small objects improved by 7% on average. A large number of real-robot experiments also show that the approach generalizes well to unseen objects. To show the improvement on small objects more intuitively, we take the most representative previous method, the graspness-based sampling network (GSNet), as the baseline and visualize the grasp detection results of the baseline and of the proposed method in four cluttered scenes. The visualizations show that the previous method tends to predict grasps on large objects in the scene and fails to produce reasonable grasp poses on some small objects. By contrast, the proposed method accurately predicts grasp poses on small objects.

Conclusion: Focusing on grasping small objects, this study proposes a mask-assisted sampling method embedded in the proposed end-to-end learning network and introduces a multi-scale grouping learning strategy to improve the local geometric representation of objects. Together these improve the quality of grasps on small objects and outperform the compared methods in the evaluation on all objects. However, the proposed method has certain limitations. For example, when noisy, low-quality depth maps are used as input, existing unseen object instance segmentation methods may produce incorrect object masks, causing mask-assisted sampling to fail. In the future, we plan to investigate more robust unseen object instance segmentation methods that can correct erroneous segmentation results under low-quality depth input. This will yield more accurate object instance masks and enhance object grasp detection capability in cluttered scenes.
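The two ideas in the Method part of the abstract, sampling the same number of points on every object and grouping points in three cylinders of different radii, can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the convention that instance id 0 means background, the choice to draw with replacement when an object has too few points, and the cylinder axis, radii, and height are all hypothetical. The learned features and the self-attention layer are omitted.

```python
import math
import random
from collections import defaultdict


def mask_balanced_sample(instance_ids, points_per_object, seed=None, background_id=0):
    """Return point indices with the same quota for every object instance.

    instance_ids[i] is the (assumed) instance label of point i;
    background_id marks points that belong to no object.
    """
    rng = random.Random(seed)
    by_object = defaultdict(list)
    for i, obj_id in enumerate(instance_ids):
        if obj_id != background_id:
            by_object[obj_id].append(i)

    chosen = []
    for obj_id in sorted(by_object):
        idx = by_object[obj_id]
        if len(idx) >= points_per_object:
            chosen.extend(rng.sample(idx, points_per_object))
        else:
            # Small object: draw with replacement so it still fills its quota.
            chosen.extend(rng.choices(idx, k=points_per_object))
    return chosen


def cylinder_group(points, center, axis, radius, height):
    """Indices of points inside a cylinder that starts at `center` and
    extends `height` along `axis` (assumed here to be the approach direction)."""
    norm = math.sqrt(sum(a * a for a in axis))
    unit = [a / norm for a in axis]
    inside = []
    for i, p in enumerate(points):
        rel = [p[k] - center[k] for k in range(3)]
        along = sum(rel[k] * unit[k] for k in range(3))  # distance along the axis
        radial = math.sqrt(sum((rel[k] - along * unit[k]) ** 2 for k in range(3)))
        if 0.0 <= along <= height and radial <= radius:
            inside.append(i)
    return inside


def multi_scale_groups(points, center, axis, radii, height):
    """One point group per radius; radii are illustrative values, not the paper's."""
    return [cylinder_group(points, center, axis, r, height) for r in radii]
```

With a scene that has 1000 points on one object and 20 on another, `mask_balanced_sample(ids, 64)` returns 64 indices for each, whereas uniform sampling over the whole scene would strongly favor the larger object. In a network, each of the three groups returned by `multi_scale_groups` would be encoded separately before the features are concatenated.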

Authors: Sun Guodong; Jia Junjie; Li Mingjing; Zhang Yang (School of Mechanical Engineering, Hubei University of Technology, Wuhan 430068, China)

Affiliation: School of Mechanical Engineering, Hubei University of Technology

Source: Journal of Image and Graphics (《中国图象图形学报》), indexed in CSCD and the Peking University Core Journals list; 2024, No. 2, pp. 468-477 (10 pages)

Funding: National Natural Science Foundation of China (51775177).

Keywords: six degrees-of-freedom grasping; sampling strategy; multiscale learning; point cloud learning; deep learning

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

