Abstract
Edible wild vegetables have both nutritional and medicinal value. However, identifying them during traditional foraging relies mainly on subjective human experience, which is inefficient and error-prone. Rapid and accurate identification is therefore important for developing the wild vegetable industry and ensuring food safety. This study took the eight edible wild vegetables of the Nanjing region known as "Seven Heads and One Brain" (七头一脑) as research subjects and constructed a dataset of 2,400 images covering the eight species. Six deep learning models were trained and validated: three representative convolutional neural network (CNN) models (AlexNet, VGG16, and ResNet50) and three vision transformer (ViT) models (ViT, CaiT, and DeiT). Gradient-weighted class activation mapping (Grad-CAM) was used to analyze the models' decision-making mechanisms. ResNet50 performed best among the six models, reaching an accuracy of 94.68% on the validation set, with precision, recall, and F1-score of 97.66%, 97.74%, and 97.70%, respectively. The convolutional block attention module (CBAM) and the coordinate attention (CA) module were then added to ResNet50 for further optimization, yielding CBAM-ResNet50 and CA-ResNet50. These two models reached accuracies of 97.67% and 98.34%, improvements of 2.99 and 3.66 percentage points, respectively. The results show that, on this dataset, CNN models can outperform ViT models, that deep learning is a feasible way to identify edible wild vegetable species, and that adding attention modules can further improve recognition accuracy.
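The figures reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names `f1_score` and `point_gain` are hypothetical, and it assumes the reported F1-score is the harmonic mean of the reported precision and recall. The abstract does not say how the metrics were averaged across the eight classes, so a per-class (macro) F1 could differ slightly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (inputs and output in percent)."""
    return 2 * precision * recall / (precision + recall)


def point_gain(new_accuracy: float, base_accuracy: float) -> float:
    """Difference between two accuracies, in percentage points."""
    return round(new_accuracy - base_accuracy, 2)


# Values taken from the abstract (all in percent).
resnet50_accuracy = 94.68
resnet50_precision = 97.66
resnet50_recall = 97.74
cbam_accuracy = 97.67
ca_accuracy = 98.34

f1 = f1_score(resnet50_precision, resnet50_recall)
cbam_gain = point_gain(cbam_accuracy, resnet50_accuracy)
ca_gain = point_gain(ca_accuracy, resnet50_accuracy)

print(f"F1 from precision and recall: {f1:.2f}%")
print(f"CBAM-ResNet50 gain: {cbam_gain:.2f} percentage points")
print(f"CA-ResNet50 gain: {ca_gain:.2f} percentage points")
```

Under these assumptions the computed values agree with the reported F1-score of 97.70% and with the stated gains of 2.99 and 3.66 percentage points.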
Authors
WU Yuqiang (吴玉强)
SUN Xun (孙荀)
JI Chengming (季呈明)
HU Naijuan (胡乃娟)
Affiliations: College of Information Technology, Nanjing Police University, Nanjing 210023, Jiangsu, China; College of Engineering, Nanjing Agricultural University, Nanjing 210095, Jiangsu, China; Institute of Agricultural Economy and Development, Jiangsu Academy of Agricultural Sciences, Nanjing 210014, Jiangsu, China
Source
China Cucurbits and Vegetables (《中国瓜菜》)
CAS
PKU Core Journal (北大核心)
2024, Issue 11, pp. 57-66 (10 pages)
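Funding
Jiangsu Provincial Key Research and Development Program (BE2019762)
Fundamental Research Funds for the Central Universities (LGZD202408)
National Natural Science Foundation of China (32201923)
"14th Five-Year Plan" Jiangsu Provincial Key Discipline "Public Security Technology" (Su Jiao Yan Han [2022] No. 2)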
Keywords
Edible wild vegetables
Species identification
Convolutional neural networks
Vision transformer
Attention mechanism modules