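Abstract
To address the single information dimension of infrared images and the difficulty of detecting dim small targets whose features are inconspicuous, multiple filters with different structures are incorporated into the YOLOv5n network. Three heterogeneous filters are selected according to their distinct abilities to enhance dim small targets and suppress background interference, and they are applied to the multi-channel input images of the network. This enriches the information dimension of the original image and effectively improves the adaptability of the back-end network to dim small targets in complex backgrounds. By adding an attention module, adopting a small anchor box strategy, and cropping the deep branch of the network, the dim small target detection capability of YOLOv5n is enhanced while its computational and storage requirements are further reduced. Experimental results show that the proposed algorithm can effectively detect dim small targets in complex infrared backgrounds while using fewer storage and computational resources, which lays a foundation for deploying the algorithm on resource-constrained embedded devices.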
Objective Dim small target detection in infrared images with complex backgrounds is a key technology for precise guidance systems and infrared surveillance systems, and detection performance directly determines the success or failure of a task. It has therefore become a hot research topic, and various detection methods have been proposed. Compared with traditional algorithms, deep network algorithms have achieved remarkable results in recent years, and several frameworks built on existing deep networks have been applied to dim small target detection. Although these methods improve small target detection by modifying the network structure, infrared images carry only one dimension of information and small targets offer limited features, so it is difficult to obtain satisfactory results when a deep network is applied directly to dim small targets in complex infrared backgrounds. In addition, the large network scale makes these methods difficult to deploy on resource-constrained embedded platforms.
Methods In view of the single information dimension of infrared images and the inconspicuous features of dim small targets, this study enriches the information of the original images by incorporating multiple filters with different structures into the YOLOv5n network. Three filters with different structures, namely the Top Hat filter, the difference of Gaussian (DoG) filter, and the mean filter, are selected from the perspectives of highlighting targets, suppressing backgrounds, and filtering high-frequency noise. The three heterogeneous filters process the image at the input layer of the network, which expands the one-dimensional gray information of the original image into three dimensions. The three outputs are then fed to the network through three channels, which improves the adaptability of the network to dim small targets in complex backgrounds. The YOLOv5n network is selected and improved as follows. 1) To make the deep network raise the feature weight of the region of interest and suppress the response of unrelated regions during training, the lightweight convolutional block attention module (CBAM) is added to the backbone of YOLOv5n so that the extracted feature map plays a greater role in subsequent target extraction. The output of the convolution layer first passes through the channel attention module (CAM) to raise the weight of target-related features and then through the spatial attention module (SAM), which enables the weighted features to persist in the deeper network. 2) In the standard YOLOv5n network, target detection is carried out using the feature maps of the P17, P20, and P23 layers. During target extraction, targets are searched and selected through preset anchor boxes of different sizes. Since the shallow network has a large feature map that contains rich original information, it is conducive to small target detection. Therefore, this study adjusts the anchor sizes in the P3 layer to [5, 6, 6, 8, 9, 11]. 3) The receptive field of the shallow network is small, which is conducive to extracting local features of the target, whereas the deep network has a large receptive field and mainly extracts global features. In small target detection, the features extracted by the deep network are limited and may even interfere with the final detection results. After multi-layer feature extraction in the backbone, the deep network contains almost no small target features, so the standard YOLOv5n network is cropped to remove the P5-P23 layers, and only the P3-P17 and P4-P20 output features are used for detection. By adding attention modules, adopting a small anchor strategy, and cropping the deep branch of the network, the dim small target detection performance of the YOLOv5n network is improved, and the consumption of computational and storage resources is reduced.
Results and Discussions To verify the performance of the algorithm, this study uses the infrared dataset for dim small target detection and tracking against ground/air backgrounds. Several deep network algorithms dedicated to small target detection are selected for comparison, together with classical target detection networks modified for small target detection. In terms of detection performance, the proposed algorithm obtains the second-highest average precision (AP) value of 0.888, which is 1.4% lower than the highest value and 3% higher than the third-highest value. In terms of network size and computational efficiency, the proposed algorithm achieves the fastest processing speed of 416 frames/s at the smallest network size (3 MB), and its network size is half that of the algorithm in Ref. [7]. Compared with the algorithm with the best detection performance, the proposed algorithm is approximately 60 times more efficient, and its network size is approximately 1/16 of that algorithm's. This study also analyzes the performance gain of each improvement measure: introducing multi-heterogeneous filters, adding attention modules, adopting the small anchor box strategy, and cropping the deep network. The experimental results show that the proposed algorithm maintains excellent detection performance with the smallest parameter size and the highest operational efficiency.
Conclusions To improve the detection performance for dim small targets and make the algorithm easier to deploy, a lightweight dim small target detection network with multi-heterogeneous filters is proposed. Experimental results show that the proposed algorithm can effectively detect dim small targets in complex infrared backgrounds. In addition, it consumes fewer computational and storage resources, which lays a foundation for deployment on resource-constrained embedded platforms.
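The three-channel input described in Methods can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the window sizes, the Gaussian sigmas, the replicated-edge handling, the channel order, and all function names are assumptions chosen for illustration, and a practical version would use an array library rather than nested lists.

```python
import math

def _clamp(i, n):
    # Replicate edge pixels (assumed border handling).
    return max(0, min(n - 1, i))

def _neighborhood(img, r, c, k):
    """Values in the (2k+1)x(2k+1) window centred on (r, c), row by row."""
    h, w = len(img), len(img[0])
    return [img[_clamp(r + dr, h)][_clamp(c + dc, w)]
            for dr in range(-k, k + 1) for dc in range(-k, k + 1)]

def _apply(img, k, reduce_fn):
    """Slide a window over the image and reduce each window to one value."""
    return [[reduce_fn(_neighborhood(img, r, c, k))
             for c in range(len(img[0]))] for r in range(len(img))]

def mean_filter(img, k=1):
    # Box average: smooths high-frequency noise.
    return _apply(img, k, lambda v: sum(v) / len(v))

def top_hat(img, k=1):
    # White top-hat: image minus its morphological opening
    # (erosion followed by dilation); keeps small bright structures.
    opened = _apply(_apply(img, k, min), k, max)
    return [[p - o for p, o in zip(row, orow)]
            for row, orow in zip(img, opened)]

def gaussian_blur(img, sigma):
    k = max(1, math.ceil(3 * sigma))  # truncate the kernel at about 3 sigma
    weights = [math.exp(-(dr * dr + dc * dc) / (2 * sigma * sigma))
               for dr in range(-k, k + 1) for dc in range(-k, k + 1)]
    total = sum(weights)
    return _apply(img, k,
                  lambda v: sum(w * x for w, x in zip(weights, v)) / total)

def dog(img, sigma_small=1.0, sigma_large=2.0):
    # Difference of Gaussians: band-pass response that suppresses
    # slowly varying background.
    a, b = gaussian_blur(img, sigma_small), gaussian_blur(img, sigma_large)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def to_three_channels(img):
    """Expand one gray image into three filtered channels (assumed order)."""
    return [top_hat(img), dog(img), mean_filter(img)]

# Toy example: flat background with one bright pixel standing in for a target.
image = [[10] * 7 for _ in range(7)]
image[3][3] = 50
channels = to_three_channels(image)
```

In this toy case the top-hat channel isolates the bright pixel against a zero background, which is the target-highlighting behaviour that Methods attributes to that filter.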
Authors
赵菲
邓英捷
Zhao Fei; Deng Yingjie (National Key Laboratory of Science and Technology on ATR, College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China)
Source
《光学学报》
EI
CAS
CSCD
Peking University Core Journal
2023, No. 9, pp. 145-156 (12 pages)
Acta Optica Sinica
Funding
National Natural Science Foundation of China, Young Scientists Fund (61901489).