Effective small object detection is crucial in various applications, including urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on the Feature Pyramid Network (FPN), address this challenge by leveraging multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion due to varying resolutions across different layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), featuring two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses an attention module to focus the network on the semantic information of the object and fuses it into the lower layer, so that each layer contains similar semantic information; this alleviates the problem of small object information being submerged due to semantic gaps between layers. At the same time, SODEM strengthens the local features of the object, suppresses background noise, enhances the details of small objects, and fuses the enhanced features into the other feature layers so that each layer is rich in small object information, improving small object detection accuracy. Our extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MS COCO) and Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. The query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
In recent years, there has been extensive research on object detection methods applied to optical remote sensing images utilizing convolutional neural networks. Despite these efforts, the detection of small objects in remote sensing remains a formidable challenge. Deep network structures cause a loss of object features, nearly eliminating some subtle features associated with small objects in deep layers. Additionally, the features of small objects are susceptible to interference from background features contained within the image, leading to a decline in detection accuracy. Moreover, the sensitivity of small objects to bounding box perturbation further increases the detection difficulty. In this paper, we introduce a novel approach, Cross-Layer Fusion and Weighted Receptive Field-based YOLO (CAW-YOLO), specifically designed for small object detection in remote sensing. To address feature loss in deep layers, we have devised a cross-layer attention fusion module. Background noise is effectively filtered through the incorporation of Bi-Level Routing Attention (BRA). To enhance the model's capacity to perceive multi-scale objects, particularly small-scale objects, we introduce a weighted multi-receptive field atrous spatial pyramid pooling module. Furthermore, we mitigate the sensitivity arising from bounding box perturbation by incorporating the joint Normalized Wasserstein Distance (NWD) and Efficient Intersection over Union (EIoU) losses. The efficacy of the proposed model in detecting small objects in remote sensing has been validated through experiments conducted on three publicly available datasets. The experimental results demonstrate the model's pronounced advantages in small object detection for remote sensing, surpassing the performance of current mainstream models.
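To illustrate why a distance such as NWD can be less sensitive to bounding box perturbation than IoU, here is a minimal sketch. It assumes the commonly used formulation in which each box (cx, cy, w, h) is modelled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4); the constant `c` is dataset-dependent and the value used here is only a placeholder. The function names are hypothetical and this is not the CAW-YOLO loss itself.

```python
import math

def iou(a, b):
    """IoU of two boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # overlap height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def nwd(a, b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    c is a dataset-dependent normalising constant; 12.8 is a placeholder
    chosen for this sketch, not a value taken from the abstract.
    """
    # Second-order Wasserstein distance between the two box Gaussians.
    w2 = math.sqrt(
        (a[0] - b[0]) ** 2
        + (a[1] - b[1]) ** 2
        + ((a[2] - b[2]) / 2) ** 2
        + ((a[3] - b[3]) / 2) ** 2
    )
    return math.exp(-w2 / c)

# A 4x4 box shifted by 4 pixels no longer overlaps the original at all.
small = (10.0, 10.0, 4.0, 4.0)
shifted = (14.0, 10.0, 4.0, 4.0)
print("IoU:", iou(small, shifted), "NWD:", nwd(small, shifted))
```

For the tiny box in the example, IoU collapses to zero after a few pixels of shift, while NWD still decreases smoothly with the offset, which is the property that makes it attractive for small objects.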
Accurately identifying small objects in high-resolution aerial images presents a complex and crucial task in the field of small object detection on unmanned aerial vehicles (UAVs). This task is challenging due to variations in UAV flight altitude, differences in object scales, as well as factors like flight speed and motion blur. To enhance the detection efficacy of small targets in drone aerial imagery, we propose an enhanced You Only Look Once version 7 (YOLOv7) algorithm based on multi-scale spatial context. We build the MSC-YOLO model, which incorporates an additional prediction head, denoted as P2, to improve adaptability for small objects. We replace conventional downsampling with a Spatial-to-Depth Convolutional Combination (CSPDC) module to mitigate the loss of intricate feature details related to small objects. Furthermore, we propose a Spatial Context Pyramid with Multi-Scale Attention (SCPMA) module, which captures spatial and channel-dependent features of small targets across multiple scales. This module enhances the perception of spatial contextual features and the utilization of multiscale feature information. On the VisDrone2023 and UAVDT datasets, MSC-YOLO achieves remarkable results, outperforming the baseline method YOLOv7 by 3.0% in terms of mean average precision (mAP). The MSC-YOLO algorithm proposed in this paper has demonstrated satisfactory performance in detecting small targets in UAV aerial photography, providing strong support for practical applications.
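The idea of replacing strided downsampling with a space-to-depth rearrangement can be sketched as follows. This shows only the generic rearrangement, which halves the spatial size while moving every pixel into the channel dimension so that no value is discarded; it is not the authors' CSPDC module, and the function name and channel ordering are assumptions made for illustration.

```python
def space_to_depth(x, block=2):
    """Rearrange a feature map x[H][W][C] into [H/block][W/block][C*block*block].

    Unlike strided convolution or pooling, every input value survives;
    spatial detail is moved into channels instead of being dropped.
    """
    h, w = len(x), len(x[0])
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    for i in range(0, h, block):
        row = []
        for j in range(0, w, block):
            cell = []
            # Concatenate the channels of each pixel in the block (row-major order).
            for di in range(block):
                for dj in range(block):
                    cell.extend(x[i + di][j + dj])
            row.append(cell)
        out.append(row)
    return out

# A 4x4 single-channel map whose values are their own flat index.
feature = [[[r * 4 + c] for c in range(4)] for r in range(4)]
packed = space_to_depth(feature)
print(len(packed), len(packed[0]), len(packed[0][0]))
```

In a detector, a convolution would typically follow this rearrangement to mix the stacked channels; that part is omitted here.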
Insulator defect detection plays a vital role in maintaining the secure operation of power systems. To address the difficulty of detecting small objects and the missed detections caused by the small scale, variable scale, and fuzzy edge morphology of insulator defects, we construct an insulator dataset with 1600 samples containing flashovers and breakages. We then propose a simple and effective surface defect detection method for power line insulators that targets difficult small objects. Firstly, a high-resolution feature map is introduced and a small object prediction layer is added so that the model can detect tiny objects. Secondly, a simplified adaptive spatial feature fusion (SASFF) module is introduced to perform cross-scale spatial fusion and improve adaptability to variable multi-scale features. Finally, we propose an enhanced deformable attention mechanism (EDAM) module. By integrating a gating activation function, the model is further encouraged to learn a small number of critical sampling points near reference points, and the module improves the perception of object morphology. The experimental results indicate that, on the dataset of flashover and breakage defects, this method improves the performance of YOLOv5, YOLOv7, and YOLOv8. In practical application, it can simply and effectively improve the precision of power line insulator defect detection and reduce missed detections for difficult small objects.
Recently, object detection based on convolutional neural networks (CNNs) has developed rapidly. The backbone networks used for basic feature extraction are an important component of the whole detection task. Therefore, we present a new feature extraction strategy in this paper, named DSAFF-Net. In this strategy, we (1) design a sandwich attention feature fusion (SAFF) module, which enhances the semantic information of shallow features and the resolution of deep features, benefiting small object detection after feature fusion; and (2) add a new stage called D-block to alleviate the loss of spatial resolution that occurs when pooling layers enlarge the receptive field. The method proposed in the new stage replaces the original method of obtaining the P6 feature map and uses the result as the input of the region proposal network (RPN). In the experimental phase, we use the new strategy to extract features. The experiments use the public Microsoft Common Objects in Context (MS COCO) object detection dataset and a Corona Virus Disease 2019 (COVID-19) image classification dataset, respectively. The results show that the average recognition accuracy on the COVID-19 classification dataset is improved to 98.163%, and small object detection in object detection tasks is improved by 4.0%.
The detection of large-scale objects has achieved high accuracy, but the detection of small objects has not enjoyed similar success, owing to their low peak signal-to-noise ratio (PSNR), fewer distinguishing features, and the ease with which they are occluded by their surroundings. To solve this problem, this paper proposes an attention mechanism based on cross-Key values. Starting from the traditional transformer, this paper first improves the feature processing with a convolution module, effectively maintaining the local semantic context in the middle layer and significantly reducing the number of model parameters. Then, to enhance the effectiveness of the attention mask, two Key values are calculated simultaneously along Query and Value using dual-branch parallel processing, which strengthens the attention acquisition mode and improves the coupling of key information. Finally, focusing on the feature maps of different channels, the multi-head attention mechanism is applied to the channel attention mask to improve feature utilization in the middle layer. In comparisons on three small object datasets, the plug-and-play interactive transformer (IT-transformer) module we designed effectively improves the detection results of the baseline.
In the past several years, remarkable achievements have been made in the field of object detection. Although performance is generally improving, the accuracy of small object detection remains low compared with that of large object detection. In addition, localization misalignment issues are common for small objects, as seen in GoogLeNets and residual networks (ResNets). To address this problem, we propose an improved region-based fully convolutional network (R-FCN). The presented technique improves detection accuracy and eliminates localization misalignment by replacing position-sensitive region of interest (PS-RoI) pooling with position-sensitive precise region of interest (PS-Pr-RoI) pooling, which avoids coordinate quantization and directly calculates two-order integrals for position-sensitive score maps, thus preventing a loss of spatial precision. A validation experiment was conducted in which the Microsoft Common Objects in Context (MS COCO) training dataset was oversampled. Results showed an accuracy improvement of 3.7% for object detection tasks and an increase of 6.0% for small objects.
To solve the problem of small object detection in unmanned aerial vehicle (UAV) aerial images with complex backgrounds, a general detection method for multi-scale small objects based on the Faster region-based convolutional neural network (Faster R-CNN) is proposed. The bird's nest on a high-voltage tower is taken as the research object. Firstly, we use an improved ResNet101 convolutional neural network to extract object features, and then use multi-scale sliding windows to obtain object region proposals on convolutional feature maps with different resolutions. Finally, a deconvolution operation is added to further enhance the selected higher-resolution feature map, which is then taken as the feature mapping layer through which the region proposals are passed to the object detection sub-network. The detection results for bird's nests in UAV aerial images show that the proposed method can precisely detect small objects in aerial images.
Object detection has been studied for many years. Convolutional neural networks have made great progress in the accuracy and speed of object detection. However, due to the low resolution of small objects and their fuzzy feature representation, one of the current challenges is how to effectively detect small objects in images. Existing detectors for small objects take one of two approaches: one is to use high-resolution images as input, and the other is to increase the depth of the CNN. Both approaches increase computational cost and run time. In this paper, based on the RefineDet network framework, we propose the RF2Det network structure, which introduces the Receptive Field Block to address small object detection while balancing speed and accuracy. At the same time, we propose a Medium-level Feature Pyramid Network, which combines appropriate high-level context features with low-level features, so that the network can use both low-level and high-level features for multi-scale target detection and the accuracy of small target detection based on low-level features is improved. Extensive experiments on the MS COCO dataset demonstrate that, compared to other state-of-the-art methods, our proposed method shows significant performance improvement in the detection of small objects.
Knowledge distillation is often used for model compression and has achieved a great breakthrough in image classification, but there remains scope for improvement in object detection, especially in knowledge extraction for small objects. The main problem is that the features of small objects are often polluted by background noise and are not prominent after the down-sampling of a convolutional neural network (CNN), resulting in insufficient refinement of small object features during distillation. In this paper, we propose the Hierarchical Matching Knowledge Distillation Network (HMKD), which operates on pyramid levels P2 to P4 of the feature pyramid network (FPN), aiming to intervene on small object features before they are affected. We employ an encoder-decoder network to encapsulate low-resolution, highly semantic information from the deep layers of a teacher network, and then match the encapsulated information with high-resolution feature values of small objects from shallow layers as the key. During this process, we use an attention mechanism to measure the relevance of the query to the feature values. During decoding, knowledge is distilled to the student. In addition, we introduce a supplementary distillation module to mitigate the effects of background noise. Experiments show that our method achieves excellent improvements for both one-stage and two-stage object detectors. Specifically, applying the proposed method to Faster R-CNN achieves 41.7% mAP on COCO2017 (with ResNet50 as the backbone), which is 3.8% higher than the baseline.
With the advancement of society, science, and technology, the demand for detecting small objects in practical scenarios is growing. Such objects cover only a relatively small number of pixels, and their features are severely degraded after extraction by a deep convolutional neural network, which is detrimental to small object detection performance. An intuitive solution is therefore to increase the resolution of small objects by cropping the original image. In this paper, we propose a simple but effective object density map guided region localization module (DMGRL) to locate and crop the regions of interest where small objects may exist. Firstly, the density map of the objects is estimated by an object density map estimation network, and the coordinates of the small object regions are then calculated. Secondly, a continuous, differentiable affine transformation is used to crop these regions, so that a detector with DMGRL can be trained end-to-end instead of in two stages. Finally, the prediction results of the input image and the cropped region images are merged, and the final detection results are output by non-maximum suppression (NMS). Extensive experiments demonstrate the superior performance of detectors incorporating DMGRL.
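The final merging step described above can be sketched with a plain greedy NMS. This is a minimal illustration under stated assumptions, not the DMGRL implementation: boxes are (x1, y1, x2, y2) tuples, the helper `to_image_coords` and its crop offsets are hypothetical, and detections from cropped regions are assumed to have already been rescaled to the original resolution.

```python
def iou_xyxy(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def to_image_coords(box, offset_x, offset_y):
    """Shift a box predicted inside a crop back into full-image coordinates."""
    return (box[0] + offset_x, box[1] + offset_y, box[2] + offset_x, box[3] + offset_y)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou_xyxy(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# One detection from the full image, two from a crop whose top-left corner is (1, 1).
full_image_boxes = [(0.0, 0.0, 10.0, 10.0)]
crop_boxes = [to_image_coords(b, 1.0, 1.0) for b in [(0.0, 0.0, 10.0, 10.0), (49.0, 49.0, 59.0, 59.0)]]
merged_boxes = full_image_boxes + crop_boxes
merged_scores = [0.9, 0.8, 0.7]
kept = nms(merged_boxes, merged_scores)
print([merged_boxes[i] for i in kept])
```

The first crop detection overlaps the full-image detection heavily and is suppressed, while the second one, which only the crop found, is kept.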
Infrared small target detection is a common task in infrared image processing. Under limited computational resources, traditional methods for infrared small target detection face a trade-off between detection rate and accuracy. A fast infrared small target detection method tailored for resource-constrained conditions is proposed, based on the YOLOv5s model. This method introduces an additional small target detection head and replaces the original Intersection over Union (IoU) metric with the Normalized Wasserstein Distance (NWD), while considering both the detection accuracy and the detection speed of infrared small targets. Experimental results demonstrate that the proposed algorithm achieves a maximum effective detection speed of 95 FPS on a 15 W TPU, while reaching a maximum effective detection accuracy of 91.9 AP@0.5, effectively improving the efficiency of infrared small target detection under resource-constrained conditions.
A Mach-Zehnder interferometer combined with image processing has been set up to study the temperature field of small objects. Small objects with width-to-height ratios ranging from 0.33 to 1.0, subjected to natural convection, are used to simulate conditions in which end effects are significant. An interferogram representing the temperature gradient is captured, digitized, and processed with the aid of an image processing unit. A simple correction model is proposed to treat the stored interferometric data. The interference data are found to compare well with those provided by thermocouples.
Detecting small moving objects in astronomical image sequences is a significant research problem in space surveillance. Compressive sensing provides a simple and computationally cheap coding scheme for onboard astronomical remote sensing. An algorithm for small moving space object detection and localization is proposed. The algorithm determines the measurements of objects by comparing the measurements of the current image with the measurements of the background scene. Rather than reconstructing the whole image, only a foreground image is reconstructed, which leads to efficient computation, and a high level of localization accuracy is achieved. Experiments and analysis are provided to show the performance of the proposed approach on detection and localization.
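The measurement-differencing step relies on the linearity of compressive measurements: if y = Φx, then the difference between the current and background measurements equals the measurement of the foreground alone. The sketch below demonstrates only this property with a random Gaussian matrix; the matrix, sizes, and signal are assumptions chosen for illustration, and the sparse reconstruction of the foreground is not shown.

```python
import random

def measure(phi, x):
    """Compressive measurement y = phi @ x for a flattened image x."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]

random.seed(0)
n_pixels, n_measurements = 64, 16  # far fewer measurements than pixels

# Random Gaussian measurement matrix (an assumed choice for this sketch).
phi = [[random.gauss(0.0, 1.0) for _ in range(n_pixels)] for _ in range(n_measurements)]

background = [random.random() for _ in range(n_pixels)]
current = list(background)
current[20] += 5.0  # a small bright object appears at pixel 20

y_background = measure(phi, background)
y_current = measure(phi, current)

# Difference of measurements, computed without reconstructing either image.
y_diff = [a - b for a, b in zip(y_current, y_background)]

# By linearity this equals the measurement of the sparse foreground image.
foreground = [c - b for c, b in zip(current, background)]
y_foreground = measure(phi, foreground)

max_error = max(abs(a - b) for a, b in zip(y_diff, y_foreground))
print("max deviation between y_diff and y_foreground:", max_error)
```

Because the foreground is sparse, it is a much easier target for reconstruction than the full image, which is consistent with the computational saving the abstract describes.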
A literature analysis has shown that object search, recognition, and tracking systems are becoming increasingly popular. However, such systems do not achieve high practical results in analyzing small moving living objects ranging from 8 to 14 mm. This article examines methods and tools for recognizing and tracking the class of small moving objects, such as ants. To fulfill those aims, a customized You Only Look Once Ants Recognition (YOLO_AR) Convolutional Neural Network (CNN) has been trained to recognize Messor structor ants in the laboratory using the LabelImg object marker tool. The proposed model is an extension of the You Only Look Once v4 (YOLOv4) 512×512 model with an additional Self-Regularized Non-Monotonic (Mish) activation function. Additionally, a scalable solution for continuous object recognition and tracking was implemented. This solution is based on the OpenDataCam system, with extended Object Tracking modules that allow for tracking and counting objects that have crossed a custom boundary line. During the study, the methods of the alignment algorithm for finding the trajectory of moving objects were modified. I discovered that the Hungarian algorithm showed better results in tracking small objects than the k-dimensional tree (k-d tree) matching algorithm used in OpenDataCam. Remarkably, such an algorithm showed better results with the implemented YOLO_AR model due to the lack of False Positives (FP). Therefore, I provided a new tracker module with a Hungarian matching algorithm verified on the Multiple Object Tracking (MOT) benchmark. Furthermore, additional customization parameters for parsing and filtering object recognition and tracking results were added, such as boundary angle threshold (BAT) and past frames trajectory prediction (PFTP). Experimental tests confirmed the results of the study on a mobile device. During the experiment, parameters such as the quality of recognition and tracking of moving objects, the PFTP and BAT, and the configuration parameters of the neural network and boundary line model were analyzed. The results showed that the proposed methods increased tracking accuracy by 50%. The study results confirmed the relevance of the topic and the effectiveness of the implemented methods and tools.
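For reference, the Mish activation mentioned above is commonly defined as mish(x) = x · tanh(softplus(x)), with softplus(x) = ln(1 + eˣ). The sketch below evaluates it on scalars using only the standard library; it illustrates the function's shape and is not taken from the YOLO_AR code.

```python
import math

def softplus(x):
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))

for value in (-5.0, -1.0, 0.0, 1.0, 10.0):
    print(value, mish(value))
```

The function is close to the identity for large positive inputs, passes through zero at the origin, and dips slightly below zero for moderate negative inputs before returning towards zero, which is the non-monotonic behaviour the name refers to.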
Funding (TA-FPN): This research was funded by the Natural Science Foundation of Hebei Province (F2021506004).
Funding (H-DETR): Supported in part by the National Natural Science Foundation of China under Grant 62006071 and in part by the Science and Technology Research Project of Henan Province under Grant 232103810086.
Funding (CAW-YOLO): The Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2023GXJS163, ZDYF2024GXJS014); the National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024); the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012); the Hainan Provincial Natural Science Foundation of China (Grant No. 620MS021); and the Youth Foundation Project of Hainan Natural Science Foundation (621QN211).
Funding: State Grid Jiangsu Electric Power Co., Ltd. of the Science and Technology Project (Grant No. J2022004).
Abstract: Insulator defect detection plays a vital role in maintaining the secure operation of power systems. To address the difficulty of detecting small objects and the problem of missed objects caused by the small scale, variable scale, and fuzzy edge morphology of insulator defects, we construct an insulator dataset with 1600 samples containing flashovers and breakages. Then a simple and effective surface defect detection method of power line insulators for difficult small objects is proposed. Firstly, a high-resolution feature map is introduced and a small object prediction layer is added so that the model can detect tiny objects. Secondly, a simplified adaptive spatial feature fusion (SASFF) module is introduced to perform cross-scale spatial fusion to improve adaptability to variable multi-scale features. Finally, we propose an enhanced deformable attention mechanism (EDAM) module. By integrating a gating activation function, the model is further encouraged to learn a small number of critical sampling points near reference points, and the module can improve the perception of object morphology. The experimental results indicate that, on the dataset of flashover and breakage defects, this method improves the performance of YOLOv5, YOLOv7, and YOLOv8. In practical application, it can simply and effectively improve the precision of power line insulator defect detection and reduce missed detections for difficult small objects.
Funding: the National Natural Science Foundation of China under Grants 62172059 and 62072055; Hunan Provincial Natural Science Foundations of China under Grant 2020JJ4626; Scientific Research Fund of Hunan Provincial Education Department of China under Grant 19B004; “Double First-class” International Cooperation and Development Scientific Research Project of Changsha University of Science and Technology under Grant 2018IC25; the Young Teacher Growth Plan Project of Changsha University of Science and Technology under Grant 2019QJCZ076.
Abstract: Recently, object detection based on convolutional neural networks (CNNs) has developed rapidly. The backbone networks for basic feature extraction are an important component of the whole detection task. Therefore, we present a new feature extraction strategy in this paper, named DSAFF-Net. In this strategy, we design: 1) a sandwich attention feature fusion module (SAFF module), whose purpose is to enhance the semantic information of shallow features and the resolution of deep features, which is beneficial to small object detection after feature fusion; 2) a new stage called D-block, added to alleviate the decrease in spatial resolution that occurs when the pooling layer increases the receptive field. The method proposed in the new stage replaces the original method of obtaining the P6 feature map and uses the result as the input of the regional proposal network (RPN). In the experimental phase, we use the new strategy to extract features. The experiments use the public Microsoft Common Objects in Context (MS COCO) object detection dataset and a Corona Virus Disease 2019 (COVID-19) image classification dataset, respectively. The results show that the average recognition accuracy of COVID-19 in the classification dataset is improved to 98.163%, and small object detection in object detection tasks is improved by 4.0%.
Abstract: The detection of large-scale objects has achieved high accuracy, but the detection of small objects does not enjoy similar success because of their low peak signal to noise ratio (PSNR), fewer distinguishing features, and the ease with which they are occluded by their surroundings. To solve this problem, this paper proposes an attention mechanism based on cross-Key values. Based on the traditional transformer, this paper first improves the feature processing with the convolution module, effectively maintaining the local semantic context in the middle layer and significantly reducing the number of parameters of the model. Then, to enhance the effectiveness of the attention mask, two Key values are calculated simultaneously along Query and Value by using dual-branch parallel processing, which is used to strengthen the attention acquisition mode and improve the coupling of key information. Finally, focusing on the feature maps of different channels, the multi-head attention mechanism is applied to the channel attention mask to improve the feature utilization effect of the middle layer. In comparisons on three small object datasets, the plug-and-play interactive transformer (IT-transformer) module that we designed effectively improves the detection results of the baseline.
Funding: This project was supported by the National Natural Science Foundation of China under Grant U1836208; the Hunan Provincial Natural Science Foundations of China under Grant 2020JJ4626; the Scientific Research Fund of Hunan Provincial Education Department of China under Grant 19B004; the “Double First-class” International Cooperation and Development Scientific Research Project of Changsha University of Science and Technology under Grant 2018IC25; the Young Teacher Growth Plan Project of Changsha University of Science and Technology under Grant 2019QJCZ076.
Abstract: In the past several years, remarkable achievements have been made in the field of object detection. Although performance is generally improving, the accuracy of small object detection remains low compared with that of large object detection. In addition, localization misalignment issues are common for small objects, as seen in GoogLeNets and residual networks (ResNets). To address this problem, we propose an improved region-based fully convolutional network (R-FCN). The presented technique improves detection accuracy and eliminates localization misalignment by replacing position-sensitive region of interest (PS-RoI) pooling with position-sensitive precise region of interest (PS-Pr-RoI) pooling, which avoids coordinate quantization and directly calculates two-order integrals for position-sensitive score maps, thus preventing a loss of spatial precision. A validation experiment was conducted in which the Microsoft common objects in context (MS COCO) training dataset was oversampled. Results showed an accuracy improvement of 3.7% for object detection tasks and an increase of 6.0% for small objects.
Funding: National Defense Pre-research Fund Project (No. KMGY318002531).
Abstract: In order to solve the problem of small object detection in unmanned aerial vehicle (UAV) aerial images with complex backgrounds, a general detection method for multi-scale small objects based on Faster region-based convolutional neural network (Faster R-CNN) is proposed. The bird’s nest on the high-voltage tower is taken as the research object. Firstly, we use the improved convolutional neural network ResNet101 to extract object features, and then use multi-scale sliding windows to obtain the object region proposals on the convolution feature maps with different resolutions. Finally, a deconvolution operation is added to further enhance the selected feature map with higher resolution, which is then taken as a feature mapping layer of the region proposals passed to the object detection sub-network. The detection results of the bird’s nest in UAV aerial images show that the proposed method can precisely detect small objects in aerial images.
Abstract: Object detection has been studied for many years. The convolutional neural network has made great progress in the accuracy and speed of object detection. However, due to the low resolution of small objects and their fuzzy feature representation, one of the current challenges is how to effectively detect small objects in images. Existing detectors for small objects follow one of two approaches: one uses high-resolution images as input, and the other increases the depth of the CNN. Both approaches increase computational cost and running time. In this paper, based on the RefineDet network framework, we propose our network structure RF2Det by introducing the Receptive Field Block to solve the problem of small object detection, so as to achieve a balance of speed and accuracy. At the same time, we propose a Medium-level Feature Pyramid Networks, which combines appropriate high-level context features with low-level features, so that the network can use both low-level and high-level features for multi-scale target detection, and the accuracy of the small target detection task based on the low-level features is improved. Extensive experiments on the MS COCO dataset demonstrate that, compared to other state-of-the-art methods, our proposed method shows significant performance improvement in the detection of small objects.
Funding: supported in part by the Joint Fund of the Ministry of Education for Equipment Pre-Research of China under Grant No. 8091B032257; the National Natural Science Foundation of China under Grant Nos. 62106232 and 62372415; the China Postdoctoral Science Foundation under Grant No. 2021TQ0301; the Outstanding Youth Science Fund of Henan Province of China under Grant No. 242300421050.
Abstract: Knowledge distillation is often used for model compression and has achieved a great breakthrough in image classification, but there still remains scope for improvement in object detection, especially for knowledge extraction of small objects. The main problem is that the features of small objects are often polluted by background noise and are not prominent due to the down-sampling of the convolutional neural network (CNN), resulting in insufficient refinement of small object features during distillation. In this paper, we propose the Hierarchical Matching Knowledge Distillation Network (HMKD), which operates on pyramid levels P2 to P4 of the feature pyramid network (FPN), aiming to intervene on small object features before they are affected. We employ an encoder-decoder network to encapsulate low-resolution, highly semantic information, akin to eliciting insights from the deep layers of a teacher network, and then match the encapsulated information with high-resolution feature values of small objects from shallow layers as the key. During this period, we use an attention mechanism to measure the relevance of the inquiry to the feature values. In the process of decoding, knowledge is also distilled to the student. In addition, we introduce a supplementary distillation module to mitigate the effects of background noise. Experiments show that our method achieves excellent improvements for both one-stage and two-stage object detectors. Specifically, applying the proposed method on Faster R-CNN achieves 41.7% mAP on COCO2017 (ResNet50 as the backbone), which is 3.8% higher than that of the baseline.
Funding: Supported by the National Center ATC Surveillance and Communication System Engineering Research.
Abstract: With the advancement of society and science and technology, the demand for detecting small objects in practical scenarios becomes stronger. Such objects are represented by only a relatively small number of pixels, and their features are degraded severely after being extracted by a deep convolutional neural network, which is detrimental to the detection performance for small objects. Therefore, an intuitive solution is to increase the resolution of small objects by cropping the original image. In this paper, we propose a simple but effective object density map guided region localization module (DMGRL) to locate and crop the regions of interest where small objects may exist. Firstly, the density map of the objects is estimated by an object density map estimation network, and then the coordinates of the small object regions are calculated. Secondly, a continuous differentiable affine transformation is utilized to crop these regions so that the detector with DMGRL can be trained end-to-end instead of in two stages. Finally, all prediction results of the input image and the cropped region images are merged to output the final detection results by non-maximum suppression (NMS). Extensive experiments demonstrate the superior performance of the detector incorporating DMGRL.
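The final merging step described above can be sketched as follows: detections from a cropped region are mapped back into full-image coordinates, pooled with the full-image detections, and filtered by greedy non-maximum suppression. This is a minimal illustration, not the DMGRL implementation. The helper names (`to_image_coords`, `iou`, `nms`), the `(x1, y1, x2, y2)` box format, the crop origin and scale, and the 0.5 IoU threshold are all assumptions made for the example.

```python
def to_image_coords(box, crop_origin, scale):
    """Map a box from (upsampled) crop coordinates back to full-image coordinates.

    crop_origin is the crop's top-left corner in the image; scale is the
    factor by which the crop was enlarged before detection (both assumed).
    """
    ox, oy = crop_origin
    x1, y1, x2, y2 = box
    return (x1 / scale + ox, y1 / scale + oy, x2 / scale + ox, y2 / scale + oy)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two from the full image, one from a 2x-enlarged crop.
full_boxes = [(10.0, 10.0, 20.0, 20.0), (50.0, 50.0, 60.0, 60.0)]
full_scores = [0.6, 0.7]
crop_boxes = [to_image_coords((4.0, 4.0, 24.0, 24.0), crop_origin=(8.0, 8.0), scale=2.0)]
crop_scores = [0.9]

merged_boxes = full_boxes + crop_boxes
merged_scores = full_scores + crop_scores
kept = nms(merged_boxes, merged_scores)
```

Here the crop detection and the first full-image detection describe the same object, so only the higher-scoring crop detection survives, which is the intended benefit of detecting small objects at higher resolution.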
Abstract: Infrared small target detection is a common task in infrared image processing. Under limited computational resources, traditional methods for infrared small target detection face a trade-off between the detection rate and the accuracy. A fast infrared small target detection method tailored for resource-constrained conditions is proposed for the YOLOv5s model. This method introduces an additional small target detection head and replaces the original Intersection over Union (IoU) metric with Normalized Wasserstein Distance (NWD), while considering both the detection accuracy and the detection speed of infrared small targets. Experimental results demonstrate that the proposed algorithm achieves a maximum effective detection speed of 95 FPS on a 15 W TPU, while reaching a maximum effective detection accuracy of 91.9 AP@0.5, effectively improving the efficiency of infrared small target detection under resource-constrained conditions.
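To illustrate why NWD can be a better fit than IoU for tiny targets, the sketch below uses the commonly cited NWD formulation, in which each box is modeled as a 2D Gaussian and the similarity is an exponential of the Wasserstein distance between the two Gaussians. It is a sketch under assumptions, not the abstract's implementation: the `(cx, cy, w, h)` box format, the function names, and the normalizing constant `c=12.8` (a dataset-dependent value that would need tuning) are chosen here for illustration.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance between two (cx, cy, w, h) boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and standard
    deviations (w/2, h/2). c is an assumed, dataset-dependent constant.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    wasserstein = math.sqrt(
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-wasserstein / c)


def iou_cxcywh(box_a, box_b):
    """Plain IoU for (cx, cy, w, h) boxes, shown only for comparison."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    inter = max(0.0, min(ax2, bx2) - max(ax1, bx1)) * max(0.0, min(ay2, by2) - max(ay1, by1))
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


# Two hypothetical 4x4-pixel targets whose centers are 6 pixels apart.
target = (10.0, 10.0, 4.0, 4.0)
prediction = (16.0, 10.0, 4.0, 4.0)
similarity_iou = iou_cxcywh(target, prediction)
similarity_nwd = nwd(target, prediction)
```

For these two boxes IoU is exactly zero because they do not overlap, so it gives no signal about how close the prediction is, whereas NWD stays strictly between 0 and 1 and decreases smoothly with distance.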
Abstract: A Mach-Zehnder interferometer combined with image processing has been set up to study the temperature field of small objects. Small objects with width to height ratios ranging from 0.33 to 1.0 subjected to natural convection are used to simulate the condition at which the end effects are significant. An interferogram representing the temperature gradient is captured, digitized, and processed with the aid of an image processing unit. A simple correction model is proposed to treat the stored interferometric data. The interference data are found to compare well with those provided by the thermocouples.
Funding: supported by the National Natural Science Foundation of China (60903126); the China Postdoctoral Special Science Foundation (201003685); the China Postdoctoral Science Foundation (20090451397); the Northwestern Polytechnical University Foundation for Fundamental Research (JC201120).
Abstract: It is known that detecting small moving objects in astronomical image sequences is a significant research problem in space surveillance. The new theory, compressive sensing, provides a very easy and computationally cheap coding scheme for onboard astronomical remote sensing. An algorithm for small moving space object detection and localization is proposed. The algorithm determines the measurements of objects by comparing the difference between the measurements of the current image and the measurements of the background scene. In contrast to reconstructing the whole image, only a foreground image is reconstructed, which leads to effective computational performance, and a high level of localization accuracy is achieved. Experiments and analysis are provided to show the performance of the proposed approach on detection and localization.
Abstract: A literature analysis has shown that object search, recognition, and tracking systems are becoming increasingly popular. However, such systems do not achieve high practical results in analyzing small moving living objects ranging from 8 to 14 mm. This article examines methods and tools for recognizing and tracking the class of small moving objects, such as ants. To fulfill those aims, a customized You Only Look Once Ants Recognition (YOLO_AR) Convolutional Neural Network (CNN) has been trained to recognize Messor structor ants in the laboratory using the LabelImg object marker tool. The proposed model is an extension of the You Only Look Once v4 (YOLOv4) 512×512 model with an additional Self Regularized Non-Monotonic (Mish) activation function. Additionally, a scalable solution for continuous object recognition and tracking was implemented. This solution is based on the OpenDataCam system, with extended Object Tracking modules that allow for tracking and counting objects that have crossed the custom boundary line. During the study, the methods of the alignment algorithm for finding the trajectory of moving objects were modified. I discovered that the Hungarian algorithm showed better results in tracking small objects than the k-dimensional tree (k-d tree) matching algorithm used in OpenDataCam. Remarkably, such an algorithm showed better results with the implemented YOLO_AR model due to the lack of False Positives (FP). Therefore, I provided a new tracker module with a Hungarian matching algorithm verified on the Multiple Object Tracking (MOT) benchmark. Furthermore, additional customization parameters for parsing and filtering object recognition and tracking results were added, such as boundary angle threshold (BAT) and past frames trajectory prediction (PFTP). Experimental tests confirmed the results of the study on a mobile device. During the experiment, parameters such as the quality of recognition and tracking of moving objects, the PFTP and BAT, and the configuration parameters of the neural network and boundary line model were analyzed. The results showed an increased tracking accuracy with the proposed methods by 50%. The study results confirmed the relevance of the topic and the effectiveness of the implemented methods and tools.
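The track-to-detection matching mentioned above can be sketched with a compact Hungarian (Kuhn-Munkres) solver that assigns existing tracks to new detections so that the total centroid distance is minimal. This is an illustrative sketch only; it is not the tracker module from the article or code from OpenDataCam. The function names, the use of Euclidean centroid distance as the cost, and the `max_dist` gate applied after assignment are assumptions.

```python
import math


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list in which entry i is the column assigned to row i.
    Uses the O(n^2 * m) potentials formulation of the Hungarian algorithm.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("expected at most as many rows as columns")
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials (1-based)
    v = [0.0] * (m + 1)      # column potentials (1-based)
    match = [0] * (m + 1)    # match[j]: row assigned to column j, 0 if free
    way = [0] * (m + 1)      # previous column on the augmenting path
    for i in range(1, n + 1):
        match[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = match[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    reduced = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if reduced < minv[j]:
                        minv[j], way[j] = reduced, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[match[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if match[j0] == 0:
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            match[j0] = match[j1]
            j0 = j1
    result = [-1] * n
    for j in range(1, m + 1):
        if match[j]:
            result[match[j] - 1] = j - 1
    return result


def match_tracks(tracks, detections, max_dist=50.0):
    """Pair track centroids with detection centroids; returns (track, detection) index pairs."""
    if not tracks or not detections:
        return []
    cost = [[math.dist(t, d) for d in detections] for t in tracks]
    if len(tracks) <= len(detections):
        pairs = list(enumerate(hungarian(cost)))
    else:
        transposed = [list(col) for col in zip(*cost)]
        pairs = [(t, d) for d, t in enumerate(hungarian(transposed))]
    # Assumed gating step: discard pairs that are implausibly far apart.
    return sorted((t, d) for t, d in pairs if cost[t][d] <= max_dist)


# Hypothetical centroids (pixels) from the previous and current frames.
pairs = match_tracks([(0, 0), (10, 10)], [(11, 9), (1, 1), (100, 100)])
```

Unlike a nearest-neighbour lookup, which matches each track independently, the assignment is solved jointly, so two nearby tracks cannot both claim the same detection.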