Funding: the National Natural Science Foundation of China under Grant No. 61672181.
Abstract: Detection efficiency plays an increasingly important role in object detection tasks. One-stage methods are widely adopted in practice because of their high efficiency, especially in real-time detection tasks such as face recognition and self-driving cars. RetinaMask achieved significant progress among one-stage detectors by adding a semantic segmentation branch, but it is limited in detecting multi-scale objects. To solve this problem, this paper proposes the RetinaMask with Gate (RMG) model, which consists of four main modules. It extends RetinaMask with a gate mechanism that extracts and combines features at different levels more effectively according to object size. First, multi-level features were extracted from the input image by ResNet. Second, a fused feature pyramid was constructed through a feature pyramid network, and the gate mechanism was then employed to adaptively enhance and integrate features at various scales with respect to object size. Finally, three prediction heads were added for classification, localization, and mask prediction, driving the model to learn with mask prediction. The predictions of all levels were integrated during post-processing. The augmented network shows better object detection performance, especially for small objects, without increasing computation cost or inference time.
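The abstract does not state how the gate is formulated, so the following is only a minimal sketch of one common way to gate and fuse pyramid levels. It assumes each level has already been resized to a common shape and flattened to a vector, and that each level gets a scalar sigmoid gate computed from its mean activation. The names `gated_fusion`, `weights`, and `biases` are hypothetical and are not taken from the RMG paper.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(levels, weights, biases):
    """Fuse same-length feature vectors from several pyramid levels.

    Assumption (not from the paper): each level receives one scalar gate,
    sigmoid(w * mean(level) + b), and the fused vector is the
    gate-weighted sum of the levels.
    """
    if not (len(levels) == len(weights) == len(biases)):
        raise ValueError("one weight and one bias per level is required")
    length = len(levels[0])
    if any(len(f) != length for f in levels):
        raise ValueError("all levels must have the same length")

    # One gate per level, driven by that level's mean activation.
    gates = [
        sigmoid(w * (sum(f) / length) + b)
        for f, w, b in zip(levels, weights, biases)
    ]
    # Element-wise gate-weighted sum across levels.
    fused = [
        sum(g * f[i] for g, f in zip(gates, levels))
        for i in range(length)
    ]
    return gates, fused


# Two toy "levels"; with zero weights and biases every gate is 0.5.
levels = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
gates, fused = gated_fusion(levels, weights=[0.0, 0.0], biases=[0.0, 0.0])
print(gates, fused)
```

In a trained network the gate parameters would be learned, so that levels better matched to an object's size receive gates closer to 1; here they are fixed by hand purely for illustration.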
Funding: the Natural Science Foundation of Shandong Province in China (Grant No. ZR2020MF076), the Focus on Research and Development Plan in Shandong Province (Grant No. 2019GNC106115), the National Natural Science Foundation of China (Grant No. 62072289), the Shandong Province Higher Educational Science and Technology Program (Grant No. J18KA308), and the Taishan Scholar Program of Shandong Province of China.
Abstract: In complex orchard environments, efficient and accurate detection of target fruit is a basic requirement for orchard yield measurement and automatic harvesting. Target fruits can be hard to distinguish from the background because of their similar color, and detection is further complicated by the ambient light and the camera angle at which the photos were taken. These problems make it hard to detect green fruits in orchard environments. In this study, a two-stage dense to detection framework (D2D) was proposed to detect green fruits in orchard environments. The proposed model performed multi-scale feature extraction of the target fruit using a MobileNetV2+FPN (feature pyramid network) structure and generated region proposals for the target fruit using a Region Proposal Network (RPN) structure. In the regression branch, the offset of each local feature was calculated, and the positive and negative samples of the region proposals were predicted by binary mask prediction to reduce background interference with the predicted box. In the classification branch, features were extracted from each sub-region of the region proposal, and features carrying discriminative information were obtained through adaptive weighted pooling to achieve accurate classification. The proposed model adopted an anchor-free design, which improves generalization ability, makes the model more robust, and reduces storage requirements. Experimental results on persimmon and green apple datasets show that the new model has the best detection performance, which can provide a theoretical reference for detecting other green objects.
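The abstract names adaptive weighted pooling over sub-regions of a region proposal but gives no formula. The sketch below illustrates one plausible reading, assuming each sub-region has a feature vector and a scalar relevance score, and that the scores are turned into pooling weights with a softmax. The function `adaptive_weighted_pool` and its inputs are hypothetical and are not taken from the D2D paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_weighted_pool(region_features, region_scores):
    """Pool sub-region feature vectors into one vector.

    Assumption (not from the paper): the pooling weights are the softmax
    of per-sub-region scores, so more informative sub-regions contribute
    more to the pooled feature.
    """
    if len(region_features) != len(region_scores):
        raise ValueError("one score per sub-region is required")
    weights = softmax(region_scores)
    dim = len(region_features[0])
    pooled = [
        sum(w * f[d] for w, f in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, pooled


# Equal scores reduce to plain average pooling over the sub-regions.
features = [[2.0, 4.0], [4.0, 8.0]]
weights, pooled = adaptive_weighted_pool(features, [0.0, 0.0])
print(weights, pooled)
```

With equal scores this reduces to average pooling; as one score grows relative to the others, the pooled feature approaches that sub-region's feature, which is the sense in which the weighting is adaptive in this sketch.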