Abstract
This study addresses the challenge of accurately and reliably detecting tomatoes in dense planting environments, a critical prerequisite for automated robotic harvesting. The heavy reliance of deep learning models on large manually annotated datasets still significantly limits their application in real-world agricultural production. To overcome this limitation, we combined domain adaptive learning with the YOLOv5 model to develop a tomato detection model called TDA-YOLO (tomato detection domain adaptation). Normal illumination scenes in dense planting environments were designated as the source domain, and scenes under various other illumination conditions as the target domain. To bridge the source and target domains, neural preset color style transfer was introduced to generate a pseudo-dataset, which reduces the domain discrepancy. Furthermore, a semi-supervised learning method was incorporated so that the model extracts domain-invariant features more fully, and knowledge distillation was used to improve the model's ability to adapt to the target domain. In addition, to increase inference speed and lower computational demand, the lightweight FasterNet network was integrated into the C3 module of YOLOv5, yielding a modified C3_Faster module. Experimental results showed that TDA-YOLO significantly outperformed the original YOLOv5s model, achieving a mean average precision (mAP) of 96.80% for tomato detection across diverse scenarios in dense planting environments, an increase of 7.19 percentage points; it was also 2.17 and 1.19 percentage points higher than YOLOv8 and YOLOv9, respectively. The average detection time per image was 15 ms, with 13.8 G FLOPs (floating-point operations). After acceleration, the TDA-YOLO model deployed on the Jetson Xavier NX development board achieved a detection accuracy of 90.95% and an mAP of 91.35%, with a detection time of 21 ms per image, which still meets the requirements of real-time tomato detection in dense planting environments. These results show that TDA-YOLO can detect tomatoes accurately and quickly in dense planting environments while avoiding the need for large amounts of annotated data, providing technical support for the development of automatic harvesting systems for tomatoes and other fruits.
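To address the poor adaptability of fruit detection models to different scenes in dense planting environments and their heavy dependence on data, this study combined the YOLOv5 model with domain adaptive learning and proposed a tomato domain-adaptive detection model, TDA-YOLO (tomato detection domain adaptation). Normal illumination scenes in dense planting environments were taken as the source domain, and other illumination scenes as the target domain. First, neural preset color style transfer was introduced to construct a pseudo-dataset, reducing the discrepancy between the source and target domains. Second, a semi-supervised learning method was incorporated so that the model extracts domain-invariant features more fully, and knowledge distillation was used to improve the model's ability to adapt to the target domain. In addition, the lightweight FasterNet network was integrated into the C3 module to speed up inference and reduce the number of parameters. The experimental results showed that, across different scenes in dense planting environments, the TDA-YOLO model achieved a mean average precision of 96.80% for tomato detection, 7.19 percentage points higher than the original YOLOv5s model and 2.17 and 1.19 percentage points higher than YOLOv8 and YOLOv9, respectively; its average detection time per image was 15 ms, with 13.8 G FLOPs. After acceleration, the TDA-YOLO model deployed on the Jetson Xavier NX development board achieved a detection accuracy of 90.95% and a mean average precision of 91.35%, with a detection time of 21 ms per image, meeting the requirements of real-time tomato detection in dense planting environments. The results show that the proposed TDA-YOLO model can detect tomatoes accurately and quickly in dense planting environments while avoiding the use of large amounts of annotated data, providing technical support for the development of automatic harvesting systems for tomatoes and other fruits.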
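The reported gains and latencies can be turned into implied baseline scores and frame rates with simple arithmetic. The sketch below is only an illustration derived from the figures quoted in the abstract: the baseline mAP values are inferred from the stated percentage-point gains and are not reported directly, and the frame rates assume one image per inference with no I/O or batching overhead. All names in the code are hypothetical.

```python
# Figures quoted in the abstract (mAP in %, gains in percentage points).
TDA_YOLO_MAP = 96.80
GAINS_PP = {"YOLOv5s": 7.19, "YOLOv8": 2.17, "YOLOv9": 1.19}

# Per-image detection times in milliseconds. The platform for the 15 ms
# figure is not stated in the abstract, so its label is an assumption.
LATENCY_MS = {"reported (platform not stated)": 15, "Jetson Xavier NX": 21}


def implied_baseline_map(model_map, gain_pp):
    """Baseline mAP implied by a gain stated in percentage points."""
    return round(model_map - gain_pp, 2)


def frames_per_second(latency_ms):
    """Throughput implied by per-image latency (ignores I/O and batching)."""
    return 1000.0 / latency_ms


baselines = {name: implied_baseline_map(TDA_YOLO_MAP, gain)
             for name, gain in GAINS_PP.items()}
fps = {name: round(frames_per_second(ms), 1)
       for name, ms in LATENCY_MS.items()}

for name, value in baselines.items():
    print(f"{name}: implied mAP {value:.2f}%")
for name, value in fps.items():
    print(f"{name}: about {value} images/s")
```

Under these assumptions, the implied baselines are 89.61% (YOLOv5s), 94.63% (YOLOv8), and 95.61% (YOLOv9), and the 15 ms and 21 ms detection times correspond to roughly 66.7 and 47.6 images per second.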
Source
《农业工程学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2024, Issue 13, pp. 134-145 (12 pages)
Transactions of the Chinese Society of Agricultural Engineering
Funding
The National Natural Science Foundation of China (32371993)
The Natural Science Research Key Project of Anhui Provincial University (2022AH040125 & 2023AH040135)
The Key Research and Development Plan of Anhui Province (202204c06020022 & 2023n06020057)