Research on Classification and Recognition Method of Tree Leaves Based on Attention Mechanism (基于注意力机制的树木叶片分类识别方法研究)
Abstract: This paper applies Vision Transformer (ViT), a representative attention-based image classification model, to tree species classification and recognition, with the aim of finding a more accurate and more efficient tree species recognition model. Three sets of comparative experiments were designed: (1) ViT and ResNet50 were trained, validated, and tested on a dataset collected in an experimental environment; (2) the ViT model was trained at different depths; (3) ViT and ResNet50 were trained, validated, and tested on a dataset collected in a real environment. On both datasets, the ViT model achieved classification performance comparable to that of ResNet50, and its time efficiency was significantly better. Reducing the network depth further improved the time efficiency of ViT. The paper also presents class activation heat maps produced by ViT when classifying real-environment images; they show that ViT attends mainly to the leaf itself, especially the leaf edges, and ignores the complex background, which effectively improves classification accuracy. Overall, the two models are comparable in classification accuracy, but ViT converges significantly faster, learns features more effectively, and generalizes better. This study is a useful attempt to apply ViT to the specific task of tree species classification and recognition, and it lays the foundation for subsequent work that integrates the advantages of ViT and CNN to identify tree species with higher efficiency and smaller data requirements on more complex plateau forestry datasets.
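The abstract's second experiment varies the depth of the ViT encoder. As a rough illustration only, the sketch below stacks a configurable number of single-head self-attention layers over a sequence of patch tokens. It is not the authors' implementation: the names `self_attention` and `tiny_encoder`, the random weights, and the token sizes are all assumptions made for the example, and multi-head attention, layer normalization, the MLP block, and positional embeddings are left out.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over patch tokens."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[s / scale for s in row] for row in matmul(q, transpose(k))]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, v), weights


def tiny_encoder(tokens, depth, dim, seed=0):
    """Stack `depth` attention layers with residual connections.

    Weights are random (hypothetical, untrained); only the structure matters.
    Returns the final tokens and the last layer's attention weights.
    """
    rng = random.Random(seed)
    weights = None
    for _ in range(depth):
        w_q, w_k, w_v = (
            [[rng.gauss(0, dim ** -0.5) for _ in range(dim)] for _ in range(dim)]
            for _ in range(3)
        )
        attended, weights = self_attention(tokens, w_q, w_k, w_v)
        # Residual connection: add the attended values back onto the tokens.
        tokens = [[t + a for t, a in zip(tr, ar)] for tr, ar in zip(tokens, attended)]
    return tokens, weights


# Five hypothetical patch tokens of dimension 4, passed through 2 layers.
patch_rng = random.Random(1)
patches = [[patch_rng.random() for _ in range(4)] for _ in range(5)]
encoded, attn = tiny_encoder(patches, depth=2, dim=4)
```

Each row of `attn` shows how strongly one patch attends to every other patch, which is the kind of signal that attention-based heat maps visualize. Lowering `depth` reduces the number of layers to compute, in line with the abstract's remark that a shallower ViT is more time-efficient.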
Authors: Zhao Xinrui; Zhang Wenyue; Xu Jingyi; Yan Fei (Linzhi Natural Resources Bureau; Resource & Environment College, Xizang Agricultural and Animal Husbandry University, Linzhi, Xizang 860000, China; Beijing Key Laboratory of Precision Forestry, Beijing Forestry University, Beijing 100083, China)
Source: Journal of Plateau Agriculture (《高原农业》), 2024, No. 4, pp. 393-403 (11 pages)
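Funding: Central Government Guidance for Local Development project of the Department of Science and Technology of Xizang Autonomous Region (XZ202301YD0043C), "Research on the carbon sequestration capacity of dominant tree species in southeastern Xizang using multi-source LiDAR combined with digital twins".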
Keywords: Tree species identification; Attention mechanism; Convolutional neural network; Visualization