Funding: supported by the National Natural Science Foundation of China (No. 92048205) and the China Scholarship Council (No. 202008310014).
Abstract: To balance the inference speed and detection accuracy of a grasp detection algorithm, both of which are important for robot grasping tasks, we propose an encoder–decoder structured pixel-level grasp detection neural network named the attention-based efficient robot grasp detection network (AE-GDN). Three spatial attention modules are introduced in the encoder stages to enhance detailed information, and three channel attention modules are introduced in the decoder stages to extract more semantic information. Several lightweight and efficient DenseBlocks connect the encoder and decoder paths to improve the feature modeling capability of AE-GDN. A high intersection over union (IoU) value between the predicted grasp rectangle and the ground truth does not necessarily correspond to a high-quality grasp configuration and may even lead to a collision, because traditional IoU loss calculations treat the center of the predicted rectangle as being as important as the area around the grippers. We design a new IoU loss based on an hourglass box matching mechanism, which creates a good correspondence between high IoU values and high-quality grasp configurations. AE-GDN achieves accuracies of 98.9% and 96.6% on the Cornell and Jacquard datasets, respectively. The inference speed reaches 43.5 frames per second with only about 1.2×10^6 parameters. The proposed AE-GDN has also been deployed on a practical robotic-arm grasping system and performs grasping well. Code is available at https://github.com/robvincen/robot_gradet.
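The abstract does not give the exact layer definitions of AE-GDN; the following is a minimal PyTorch sketch of the kind of design it describes, assuming CBAM-style spatial attention in the encoder, SE-style channel attention in the decoder, and a plain convolutional stand-in for the DenseBlock bridge. All module names, channel widths, and layer counts are illustrative assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Assumed CBAM-style spatial attention: weight each pixel by a map
    derived from channel-wise mean and max statistics."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        avg_map = x.mean(dim=1, keepdim=True)        # (B, 1, H, W)
        max_map, _ = x.max(dim=1, keepdim=True)      # (B, 1, H, W)
        attn = torch.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        return x * attn

class ChannelAttention(nn.Module):
    """Assumed SE-style channel attention: re-weight feature channels
    with a squeeze-and-excitation bottleneck."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.fc(x)

class GraspNetSketch(nn.Module):
    """Illustrative encoder-decoder: spatial attention after each of three
    encoder stages, channel attention before each of three decoder stages,
    and a small convolutional bridge standing in for the DenseBlocks.
    Pixel-wise heads predict grasp quality, angle and gripper width, as is
    common for pixel-level grasp detectors."""
    def __init__(self, in_ch=4, base=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, base, 3, 2, 1), nn.ReLU(inplace=True), SpatialAttention())
        self.enc2 = nn.Sequential(nn.Conv2d(base, base * 2, 3, 2, 1), nn.ReLU(inplace=True), SpatialAttention())
        self.enc3 = nn.Sequential(nn.Conv2d(base * 2, base * 4, 3, 2, 1), nn.ReLU(inplace=True), SpatialAttention())
        self.bridge = nn.Sequential(  # stand-in for the lightweight DenseBlocks
            nn.Conv2d(base * 4, base * 4, 3, 1, 1), nn.ReLU(inplace=True),
            nn.Conv2d(base * 4, base * 4, 3, 1, 1), nn.ReLU(inplace=True),
        )
        self.dec3 = nn.Sequential(ChannelAttention(base * 4), nn.ConvTranspose2d(base * 4, base * 2, 4, 2, 1), nn.ReLU(inplace=True))
        self.dec2 = nn.Sequential(ChannelAttention(base * 2), nn.ConvTranspose2d(base * 2, base, 4, 2, 1), nn.ReLU(inplace=True))
        self.dec1 = nn.Sequential(ChannelAttention(base), nn.ConvTranspose2d(base, base, 4, 2, 1), nn.ReLU(inplace=True))
        self.heads = nn.Conv2d(base, 4, 1)  # quality, cos(2θ), sin(2θ), width

    def forward(self, x):
        x = self.enc3(self.enc2(self.enc1(x)))
        x = self.bridge(x)
        x = self.dec1(self.dec2(self.dec3(x)))
        return self.heads(x)

if __name__ == "__main__":
    net = GraspNetSketch()
    print(net(torch.randn(1, 4, 224, 224)).shape)  # -> torch.Size([1, 4, 224, 224])
```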
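The exact hourglass box matching mechanism is not specified in the abstract; the sketch below only illustrates the argument it rests on, namely that overlap near the two gripper plates should count for more than overlap in the middle of the rectangle. The rasterization approach, the `hourglass_weights` function, and the weight values are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def rect_mask(cx, cy, w, h, theta, grid):
    """Boolean mask of a rotated grasp rectangle (center, extent w along the
    gripper-closing axis, extent h along the plates, angle in radians)."""
    ys, xs = grid
    dx, dy = xs - cx, ys - cy
    u = dx * np.cos(theta) + dy * np.sin(theta)    # along the closing axis
    v = -dx * np.sin(theta) + dy * np.cos(theta)   # along the gripper plates
    return (np.abs(u) <= w / 2) & (np.abs(v) <= h / 2)

def hourglass_weights(cx, cy, w, h, theta, grid, center_weight=0.2):
    """Assumed hourglass-style weighting based on the ground-truth rectangle:
    pixels near the two ends of the closing axis (where the gripper plates
    land) keep weight 1.0, the central band is down-weighted."""
    ys, xs = grid
    dx, dy = xs - cx, ys - cy
    u = dx * np.cos(theta) + dy * np.sin(theta)
    return np.where(np.abs(u) <= w / 4, center_weight, 1.0)

def weighted_iou(pred, gt, img_size=224):
    """Weighted IoU between predicted and ground-truth grasp rectangles,
    each given as (cx, cy, w, h, theta)."""
    grid = np.mgrid[0:img_size, 0:img_size].astype(np.float64)
    p_mask = rect_mask(*pred, grid)
    g_mask = rect_mask(*gt, grid)
    wmap = hourglass_weights(*gt, grid)
    inter = np.sum(wmap * (p_mask & g_mask))
    union = np.sum(wmap * (p_mask | g_mask))
    return inter / union if union > 0 else 0.0

# A prediction shifted toward one gripper side scores lower than plain IoU
# would suggest, because the missed gripper-plate region is weighted heavily.
gt = (112, 112, 60, 20, 0.0)
pred = (127, 112, 60, 20, 0.0)
print(round(weighted_iou(pred, gt), 3))
```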
Funding: supported by the National Key Research and Development Program of China under Grant No. 2018AAA010-3002, and the National Natural Science Foundation of China under Grant Nos. 62172392, 61702482, and 61972379.
Abstract: Grasp detection is a visual recognition task in which a robot uses its sensors to detect graspable objects in its environment. Despite steady progress in robotic grasping, it remains difficult to achieve grasp detection that is both real-time and highly accurate. In this paper, we propose a real-time robotic grasp detection method that can accurately predict potential grasps for parallel-plate robotic grippers from RGB images. Our work employs an end-to-end convolutional neural network consisting of a feature descriptor and a grasp detector. For the first time, we add an attention mechanism to the grasp detection task, which enables the network to focus on grasp regions rather than the background. In addition, we present an angular label smoothing strategy in our grasp detection method to enhance the fault tolerance of the network. We evaluate our grasp detection method quantitatively and qualitatively from different aspects on the public Cornell and Jacquard datasets. Extensive experiments demonstrate that our method outperforms state-of-the-art methods; in particular, it ranks first on both the Cornell dataset and the Jacquard dataset, achieving accuracies of 98.9% and 95.6%, respectively, at real-time speed.
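The abstract does not describe the angular label smoothing strategy in detail; the following is a minimal sketch of one plausible form, assuming the grasp angle is discretized into bins and label mass is spread to neighbouring bins with wrap-around, since the angle of a parallel-plate grasp is periodic in 180°. The function name, bin count, and smoothing width are illustrative assumptions.

```python
import math
import torch

def smooth_angle_label(angle_rad, num_bins=18, sigma_bins=1.0):
    """Turn a continuous grasp angle into a soft target distribution over
    angle bins. Mass is spread to neighbouring bins with a Gaussian over
    circular bin distance, so a one-bin near-miss is penalised far less
    than a prediction on the opposite side of the angular range."""
    bin_width = math.pi / num_bins
    target_bin = (angle_rad % math.pi) / bin_width            # fractional bin index
    bins = torch.arange(num_bins, dtype=torch.float32)
    diff = torch.abs(bins + 0.5 - target_bin)                 # distance to each bin centre
    circ = torch.minimum(diff, num_bins - diff)               # wrap-around distance
    soft = torch.exp(-0.5 * (circ / sigma_bins) ** 2)
    return soft / soft.sum()                                  # normalised soft label

# usage with soft-target cross entropy (PyTorch >= 1.10 accepts probabilities)
logits = torch.randn(1, 18)                                   # predicted angle scores
target = smooth_angle_label(torch.tensor(0.3)).unsqueeze(0)   # soft label for 0.3 rad
loss = torch.nn.functional.cross_entropy(logits, target)
print(float(loss))
```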