Abstract
Identifying fish epidemics by the naked eye depends on the experience of diagnostic personnel, and epidemic data present fine-grained problems such as small inter-class gaps and low recognition efficiency. Because the Transformer lacks the inductive bias of convolutional neural networks (CNNs), it requires a large amount of training data; CNNs, in turn, extract global features insufficiently and generalize poorly, which limits a model's classification accuracy. Based on the global interaction of all pixels in the feature map, an algorithmic model is developed, and a fish disease recognition model combining a CNN with a Vision Transformer (CViT-FDRM) is proposed. First, FishData01, a database of fish epidemics, is built. Second, a CNN extracts the fine-grained features of fish images while the Transformer's self-attention mechanism captures the global information of the images, and the two branches are trained in parallel. Then, a group normalization layer groups the sample channels and computes the mean and standard deviation per group. Finally, 404 fish epidemic images were used for testing, on which CViT-FDRM achieved 97.02% recognition accuracy. Experimental results on Oxford Flowers, an open-source fine-grained image database, show that CViT-FDRM outperforms mainstream fine-grained image classification algorithms, reaching 95.42% accuracy, an improvement of 4.84 percentage points. CViT-FDRM therefore performs well in fine-grained image recognition.
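The abstract states that the group normalization layer splits the sample channels into groups and computes the mean and standard deviation within each group. A minimal NumPy sketch of that normalization step (the function name, shapes, and group count below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def group_norm(x, num_groups, eps=1e-5):
    """Normalize each group of channels to zero mean and unit variance.

    x: array of shape (N, C, H, W); C must be divisible by num_groups.
    """
    n, c, h, w = x.shape
    assert c % num_groups == 0, "channels must split evenly into groups"
    # Reshape so each group's channels and spatial positions share statistics.
    xg = x.reshape(n, num_groups, c // num_groups, h, w)
    mean = xg.mean(axis=(2, 3, 4), keepdims=True)
    var = xg.var(axis=(2, 3, 4), keepdims=True)
    xg = (xg - mean) / np.sqrt(var + eps)
    return xg.reshape(n, c, h, w)

# Toy input: batch of 2, 8 channels split into 4 groups of 2 channels each.
x = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(2, 8, 4, 4))
y = group_norm(x, num_groups=4)
```

Unlike batch normalization, the statistics here are computed per sample and per channel group, so the layer behaves identically regardless of batch size, which is one common motivation for using it with Transformer-style training.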
Authors
Wei Liming; Zhao Kui; Wang Ning; Zhang Zhongyan; Cui Haipeng
(Department of Information Science and Engineering, Ocean University of China, Qingdao 266100, Shandong, China; Qingdao JARI Industrial Control Technology Co., Ltd., Qingdao 266071, Shandong, China)
Source
Laser & Optoelectronics Progress (《激光与光电子学进展》)
Indexed in CSCD and the Peking University Core Journal (北大核心) list
2023, No. 16, pp. 111-120 (10 pages)
Funding
Shandong Provincial Key Research and Development Program (Science and Technology Demonstration Project) (2021SFGC0701).