Wildlife Image Recognition of Infrared Cameras in the Beijing Area Based on an Improved ConvNeXt Model

Abstract

【Objective】Wildlife images captured by infrared cameras come in large volumes, contain a high proportion of invalid images, and have complex backgrounds. To address these problems, this study proposes a model that recognizes such images automatically and with high accuracy, providing more efficient support for biodiversity research and wildlife conservation.

【Method】Approximately 5 TB of image data captured over the past 4 years by infrared cameras at the stations of the Beijing Ecological Observatory Network were collected and organized. After manual annotation and data augmentation, a dataset of 4234 images in 10 classes was built. Based on the ConvNeXt convolutional neural network and the characteristics of the Beijing wildlife image dataset, the BSGG-ConvNeXt model was designed, using BlurPool, SENet, a global response normalization (GRN) layer, and GCNet to improve recognition ability. The effect of training strategies on the recognition accuracy of ConvNeXt was explored on the self-built dataset, and BSGG-ConvNeXt was compared with other classic models to clarify its advantages. The public infrared wildlife datasets Snapshot Serengeti (SS) and Caltech Camera Traps (CCT) were used to verify the model's generalization ability.

【Result】Taking the ConvNeXt-T network size as an example, the original model reached an accuracy of 74.13% on the self-built dataset with 4.47×10^9 multiply-accumulate operations (MACs). Among the individual improvements, BlurPool raised accuracy by 2.2% and reduced MACs to 1.07×10^9; SENet raised accuracy by 3.2%; GRN with the scaling layer removed raised accuracy to 87.18% and increased the number of parameters to 27.88×10^6; GCNet raised accuracy to 75.44% without increasing the computational load, but increased the number of parameters to 28.25×10^6. Combining these improvements and applying them to ConvNeXt-T gave the BSGG-ConvNeXt-T model: the number of parameters increased slightly, MACs fell to 1.07×10^9, and accuracy rose to 83.63%, higher than the original model. With pre-trained weights, BSGG-ConvNeXt-T reached an accuracy of 94.07%, higher than ResNet-50 (76.39%), ResNeXt-50 (87.60%), MobileViT (90.00%), DenseNet (87.66%), RegNet (69.90%), ConvNeXtv2 (91.93%), Swin Transformer (86.23%), and MobileOne (71.53%). Applied to four different ConvNeXt network sizes, BSGG-ConvNeXt outperformed the unimproved model on the self-built dataset in every case. BSGG-ConvNeXt reached a recognition accuracy of 50.28% on the SS dataset and 56.15% on the CCT dataset, both higher than the original model.

【Conclusion】The BSGG-ConvNeXt model recognizes wildlife images captured by infrared cameras with higher accuracy, performs well on both the self-built and the public infrared wildlife image datasets, and shows a certain degree of generalization ability.
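The figures reported in the abstract can be cross-checked with a little arithmetic. The sketch below is only a reading aid: every number is copied from the abstract, the names (`ABLATION`, `COMPARISON`, `gains_over`, `macs_reduction`, `ranked`) are hypothetical, and it assumes that "raised accuracy by 2.2%" and "by 3.2%" mean percentage points added to the 74.13% baseline, which the abstract does not state explicitly.

```python
# Figures below are copied from the abstract; nothing here is newly measured.
BASELINE_ACC = 74.13    # ConvNeXt-T accuracy (%) on the self-built dataset
BASELINE_MACS = 4.47e9  # MACs of the original ConvNeXt-T
REDUCED_MACS = 1.07e9   # MACs reported after BlurPool and for BSGG-ConvNeXt-T

# Accuracy (%) of each variant trained without pre-trained weights.
# ASSUMPTION: the "+2.2%" and "+3.2%" gains are percentage points.
ABLATION = {
    "ConvNeXt-T (baseline)": BASELINE_ACC,
    "+ BlurPool": round(BASELINE_ACC + 2.2, 2),
    "+ SENet": round(BASELINE_ACC + 3.2, 2),
    "+ GRN, scaling layer removed": 87.18,
    "+ GCNet": 75.44,
    "BSGG-ConvNeXt-T": 83.63,
}

# Accuracy (%) of the models compared with the pre-trained BSGG-ConvNeXt-T.
COMPARISON = {
    "BSGG-ConvNeXt-T (pre-trained)": 94.07,
    "ResNet-50": 76.39,
    "ResNeXt-50": 87.60,
    "MobileViT": 90.00,
    "DenseNet": 87.66,
    "RegNet": 69.90,
    "ConvNeXtv2": 91.93,
    "Swin Transformer": 86.23,
    "MobileOne": 71.53,
}


def gains_over(reference, table):
    """Accuracy difference (percentage points) of each entry versus `reference`."""
    return {name: round(acc - reference, 2) for name, acc in table.items()}


def macs_reduction(before, after):
    """Return (fraction of MACs removed, ratio before/after)."""
    return 1.0 - after / before, before / after


def ranked(table):
    """Entries sorted from highest to lowest accuracy."""
    return sorted(table.items(), key=lambda item: item[1], reverse=True)


def main():
    fraction, ratio = macs_reduction(BASELINE_MACS, REDUCED_MACS)
    print(f"MACs: {fraction:.1%} fewer ({ratio:.2f}x smaller)")
    for name, gain in gains_over(BASELINE_ACC, ABLATION).items():
        print(f"{name:30s} {gain:+.2f} points vs. baseline")
    best_name, best_acc = ranked(COMPARISON)[0]
    for name, acc in ranked(COMPARISON)[1:]:
        print(f"{best_name} leads {name:16s} by {best_acc - acc:.2f} points")


if __name__ == "__main__":
    main()
```

Under these assumptions the combined model gains 9.50 points over the ConvNeXt-T baseline while using roughly a quarter of its MACs, and the pre-trained model leads the closest competitor in the list, ConvNeXtv2, by 2.14 points.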

Authors: Qi Jiandong (齐建东); Zheng Shangzi (郑尚姿); Chen Ziyi (陈子仪); Ma Zhongtian (马鐘添)

Affiliations: College of Information, Beijing Forestry University, Beijing 100083; Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083; School of Artificial Intelligence, Tangshan University, Tangshan 063000

Source: Scientia Silvae Sinicae (《林业科学》), 2024, No. 8, pp. 33-45 (13 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.

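Funding: National Key Research and Development Program of China, project "Adaptation Mechanisms of Typical Plantation Ecosystems to Global Change" (2020YFA0608100); National Natural Science Foundation of China, project "Comprehensive Assessment of the Quality and Stability of Plantation Ecosystems under Global Change" (32071842).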

Keywords: wildlife image recognition; deep learning; convolutional neural network; ConvNeXt

Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
