
A Transformer-based apple leaf disease identification model with strong generalization (cited by: 11)

Model for identifying strong generalization apple leaf disease using Transformer
Abstract Generalization ability is key to deploying disease identification models across multiple scenarios. Targeting apple leaf disease data collected in different environments, this study proposes CaTNet, a strongly generalizing apple leaf disease identification model that can extract multiple types of features. The model adopts a dual-branch structure. First, a convolutional neural network branch is designed to extract local features from apple leaf images. Second, a vision Transformer branch with squeeze and expansion functions is built to extract global features from apple leaf images. Finally, the two kinds of features are fused, so that the Transformer branch can learn local features and the convolutional neural network branch can learn global features. Compared with a range of convolutional neural network and Transformer models, the model generalizes better: trained only on laboratory leaf data, it reaches 80% identification accuracy on natural-environment data, a substantial improvement over the 72.14% of the convolutional network EfficientNetV2 and the 52.72% of the Transformer network PVT. It effectively improves identification accuracy on data from different environments and addresses the high training cost and weak generalization of deep learning models.

Apple diseases have posed a serious risk to orchard income in recent years. Accurate and rapid identification of apple diseases is of great benefit to disease prevention and control. Most identification models have been trained on laboratory data, mainly because deliberately infecting apples in a real orchard is rarely feasible. However, most such models cannot fully meet the requirements of disease detection in large-scale production. In this study, a deep learning model (called CaTNet) was proposed to extract both global and local information from images of diseased apple leaves. The disease image data was collected from apple orchards in Jilin Province, China. A total of 16,464 images were obtained from several publicly available datasets and from field collection, covering both laboratory and natural environments. Firstly, a model structure was constructed with both a Transformer and a convolutional neural network (CNN). Global and local information was extracted from the original images using the two branches. Learning a wider variety of features improved the generalization ability of the model. Meanwhile, the global features improved the resistance of the model to interference. Secondly, the Transformer block in the Transformer branch was optimized to make the structure simpler. In addition, a channel compression and expansion module was designed in the Transformer branch, which reduced the training cost of CaTNet by lowering the channel dimension of the input features. The multilayer perceptrons were then replaced by grouped convolutional layers to further improve the computational speed of the model. Thirdly, a lightweight CNN branch was constructed with an inverted residual structure that combines pointwise convolutions, which expand the channels, with 3×3 convolutions, which extract information. The CNN branch was used to extract the local features of the image, making the model more sensitive to fine-grained features. Finally, a concat operation was used to fuse the feature outputs of the two branches. After fusion, the CNN branch extracted local features from the global ones, whereas the Transformer branch extracted global features from the local ones. Cycling the multiple features in this way also improved the generalization of the model. A comparison was made to clarify the effect of different down-sampling methods on the two-branch network. Specifically, accuracies of 79.35%, 74.06%, and 67.95% were obtained using pooling, a 3×3 convolution kernel, and a 1×1 convolution kernel for down-sampling, respectively. The CaTNet model with two branches showed a computational speed of 0.1082 s/frame, which was faster than other deep learning models such as EfficientNetV2-s (0.3832 s/frame) and PVT-t (0.1778 s/frame). Consequently, the two-branch structure can accommodate more computation while still achieving a higher computational speed. These findings offer a design approach for building deep learning models with strong generalization capability, in particular models that reach high accuracy when trained only on easily accessible data.
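As a rough illustration of why channel compression and grouped convolutions lower the cost of a Transformer feed-forward stage, the sketch below counts weights for three variants using the standard convolution weight formula. The channel width, expansion ratio, group count, and squeeze factor are hypothetical values chosen for the arithmetic, not CaTNet's published configuration, and weight counts are only a proxy for the speed reported in the abstract.

```python
def conv_params(c_in, c_out, kernel=1, groups=1):
    """Weight count of a 2-D convolution (bias omitted)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * c_out * kernel * kernel


def mlp_params(channels, ratio, groups=1):
    """Feed-forward stage: channels -> channels*ratio -> channels.

    With groups=1 this equals a dense two-layer MLP applied per token;
    with groups>1 each layer becomes a grouped 1x1 convolution.
    """
    hidden = channels * ratio
    return (conv_params(channels, hidden, groups=groups)
            + conv_params(hidden, channels, groups=groups))


def squeezed_params(channels, squeeze, ratio, groups):
    """Compress channels, run the grouped stage at reduced width, expand back."""
    inner = channels // squeeze
    return (conv_params(channels, inner)          # compression (1x1)
            + mlp_params(inner, ratio, groups)    # grouped feed-forward
            + conv_params(inner, channels))       # expansion (1x1)


# Hypothetical settings, for illustration only.
C, RATIO, GROUPS, SQUEEZE = 256, 4, 4, 2

dense = mlp_params(C, RATIO)
grouped = mlp_params(C, RATIO, groups=GROUPS)
squeezed = squeezed_params(C, SQUEEZE, RATIO, GROUPS)

print("dense MLP weights:        ", dense)
print("grouped conv weights:     ", grouped)
print("squeeze + grouped weights:", squeezed)
```

Under these assumed settings, grouping divides the feed-forward weights by the number of groups, and running the stage at half the channel width reduces them further even after paying for the compression and expansion layers.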
Authors 徐艳蕾 孔朔琳 陈清源 高志远 李陈孝 Xu Yanlei; Kong Shuolin; Chen Qingyuan; Gao Zhiyuan; Li Chenxiao (College of Information Technology, Jilin Agricultural University, Changchun 130118, China)
Source Transactions of the Chinese Society of Agricultural Engineering (《农业工程学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2022, Issue 16, pp. 198-206 (9 pages)
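Funding International Science and Technology Cooperation Project of the Jilin Provincial Department of Science and Technology (20200801014GH); Key Science and Technology Research Project of the Changchun Science and Technology Bureau (21ZGN28).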
Keywords image identification; agriculture; convolutional neural networks; apple leaf disease; Transformer model; strong generalization ability; feature fusion