Abstract
Edible mushrooms are rich in nutrients; however, harvesting still relies mainly on manual labor. Coarse localization of each mushroom is necessary to enable a robotic arm to pick edible mushrooms accurately. Previous studies used detection algorithms that did not consider pixel-level mushroom information, and this information is lost when such algorithms are combined with a depth map. Moreover, among instance segmentation algorithms, convolutional neural network (CNN)-based methods are lightweight, but the features they extract are not correlated. To guarantee real-time localization and improve the accuracy of mushroom segmentation, this study proposes a new spatial-channel transformer network model based on Mask-RCNN (SCT-Mask-RCNN). Fusing Mask-RCNN with the self-attention mechanism extracts the global correlations of image features along the channel and spatial dimensions. Mask-RCNN is then used to maintain a lightweight structure and to extract local features with a spatial pooling pyramidal structure, achieving multiscale local feature fusion and improving detection accuracy. The results show that SCT-Mask-RCNN achieved a segmentation accuracy of 0.750 on segm_Precision_mAP and a detection accuracy of 0.638 on Bbox_Precision_mAP. Compared with existing methods, the proposed method improved Bbox_Precision_mAP and segm_Precision_mAP by over 2% and 5%, respectively.
Funding
Supported by the China Agriculture Research System of MOF and MARA (CARS-20);
the Zhejiang Provincial Key Laboratory of Agricultural Intelligent Equipment and Robotics Open Fund (2023ZJZD2301);
the Chinese Academy of Agricultural Science and Technology Innovation Project “Fruit And Vegetable Production And Processing Technical Equipment Team” (2024);
the Beijing Nova Program (20220484023).