Abstract: Effective small object detection is crucial in applications such as urban intelligent transportation and pedestrian detection. However, small objects are difficult to detect accurately because they contain less information. Many current methods, particularly those based on the Feature Pyramid Network (FPN), address this challenge through multi-scale feature fusion. However, existing FPN-based methods often suffer from inadequate feature fusion because resolutions vary across layers, leading to suboptimal small object detection. To address this problem, we propose the Two-layer Attention Feature Pyramid Network (TA-FPN), which features two key modules: the Two-layer Attention Module (TAM) and the Small Object Detail Enhancement Module (SODEM). TAM uses an attention module to focus the network on the semantic information of the object and fuses it into the lower layer, so that each layer contains similar semantic information; this alleviates the problem of small object information being submerged by semantic gaps between layers. SODEM, in turn, strengthens the local features of the object, suppresses background noise, enhances the details of small objects, and fuses the enhanced features into the other feature layers so that each layer is rich in small object information, improving small object detection accuracy. Extensive experiments on challenging datasets such as Microsoft Common Objects in Context (MS COCO) and Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes (PASCAL VOC) demonstrate the validity of the proposed method. Experimental results show a significant improvement in small object detection accuracy compared to state-of-the-art detectors.
Funding: Supported by the Program of Introducing Talents of Discipline to Universities (111 Plan) of China (B14010) and the National Natural Science Foundation of China (31727901).
Abstract: With advances in object detection technology, especially deep neural networks, many related tasks, such as medical applications and industrial automation, have achieved great success. However, detecting objects with multiple aspect ratios and scales remains a key problem. This paper proposes a top-down and bottom-up feature pyramid network (TDBU-FPN), which combines multi-scale feature representation with anchor generation at multiple aspect ratios. First, to build the multi-scale feature maps, a number of fully convolutional layers are placed after the backbone. Second, to link neighboring feature maps, top-down and bottom-up flows are adopted: the top-down flow, implemented by deconvolution, introduces context information, and the bottom-up flow, implemented by pooling, supplements suboriginal information. Third, varying object aspect ratios are handled by placing anchor shapes with different aspect ratios on each multi-scale feature map. The proposed method is evaluated on the Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes (PASCAL VOC) dataset and reaches an accuracy of 79%, a 1.8% improvement, at a detection speed of 23 fps.
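As a rough illustration of anchor generation at multiple aspect ratios (the third step in the abstract above), the following Python sketch places equal-area anchor shapes on every cell of a square feature map. The function names, the stride, the base size, and the three ratios are assumptions chosen for illustration; the abstract does not state the paper's actual settings.

```python
import math


def anchor_shapes(base_size, aspect_ratios):
    """Return (width, height) pairs that all have area base_size ** 2.

    Each aspect ratio is width / height. Names and values are illustrative.
    """
    shapes = []
    for ratio in aspect_ratios:
        width = base_size * math.sqrt(ratio)
        height = base_size / math.sqrt(ratio)
        shapes.append((width, height))
    return shapes


def anchors_for_feature_map(map_size, stride, base_size, aspect_ratios):
    """Centre one set of anchor shapes on every cell of a square feature map.

    Boxes are returned as (x_min, y_min, x_max, y_max) in input-image pixels.
    """
    boxes = []
    for row in range(map_size):
        for col in range(map_size):
            # Cell centre mapped back to input-image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for w, h in anchor_shapes(base_size, aspect_ratios):
                boxes.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return boxes


boxes = anchors_for_feature_map(
    map_size=4, stride=8, base_size=16, aspect_ratios=(0.5, 1.0, 2.0)
)
print(len(boxes))  # 4 * 4 cells * 3 ratios = 48 anchors
```

Repeating this per pyramid level with a larger stride and base size would give each feature map its own set of anchors, which is one common way to cover both scale and aspect-ratio variation.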
Abstract: This paper proposes BPMFPN Det, a novel one-stage object detector based on a bidirectional parallel multi-branch feature pyramid network (BPMFPN), for real-time detection of ground multi-scale targets by swarm unmanned aerial vehicles (UAVs). First, bidirectional parallel multi-branch convolution modules are used to construct the feature pyramid, enhancing the feature expression ability of feature layers at different scales. Next, the feature pyramid is integrated into a single-stage object detection framework to ensure real-time performance. To validate the proposed algorithm, experiments are conducted on four datasets. On the PASCAL VOC dataset, it achieves a mean average precision (mAP) of 85.4 on the VOC 2007 test set. On the detection in optical remote sensing (DIOR) dataset, it achieves 73.9 mAP. On the vehicle detection in aerial imagery (VEDAI) dataset, the detection accuracy for small land vehicle (slv) targets reaches 97.4 mAP. On the unmanned aerial vehicle detection and tracking (UAVDT) dataset, BPMFPN Det achieves an mAP of 48.75. Compared with previous state-of-the-art methods, these results are competitive. The experimental results demonstrate that the proposed algorithm can effectively address real-time detection of ground multi-scale targets in aerial images from swarm UAVs.
Funding: Supported by the National Natural Science Foundation of China (No. 61901016) and the special fund for basic scientific research in central colleges and universities (Youth Talent Support Program of Beihang University).
Abstract: Object detection is an essential part of research on scenarios such as automatic driving and pedestrian detection. Among the many types of target objects, small-scale objects are especially challenging to identify. We introduce a new feature pyramid framework, the Dual Attention based Feature Pyramid Network (DAFPN), designed to ease the difficulty of multi-scale object recognition. In DAFPN, attention is introduced when computing the top-down pathway and the lateral pathway, which use spatial attention and channel attention, respectively, so that the pyramidal feature maps are generated with enhanced spatial and channel interdependencies and carry more semantic information. Experiments are conducted on the COCO dataset, which contains a considerable number of small-scale objects. The results verify that DAFPN improves on the original Feature Pyramid Network (FPN), specifically for small-scale identification. The proposed DAFPN is promising for object detection in an era of intelligent machines that need to detect multi-scale objects.
Funding: The National Natural Science Foundation of China (No. 61603091).
Abstract: To improve the detection accuracy of small objects, a neighborhood fusion-based hierarchical parallel feature pyramid network (NFPN) is proposed. Unlike the layer-by-layer structure adopted in the feature pyramid network (FPN) and the deconvolutional single shot detector (DSSD), where the bottom layer of the feature pyramid relies on the top layer, NFPN builds the feature pyramid with no connections between the upper and lower layers. That is, it fuses only shallow features at similar scales. NFPN is highly portable and can be embedded in many models to further boost performance. Extensive experiments on the PASCAL VOC 2007, PASCAL VOC 2012, and COCO datasets demonstrate that the NFPN-based SSD, without intricate tricks, exceeds the DSSD model in both detection accuracy and inference speed, especially for small objects, e.g., 4% to 5% higher mAP (mean average precision) than SSD and 2% to 3% higher mAP than DSSD. On the VOC 2007 test set, the NFPN-based SSD with 300×300 input reaches 79.4% mAP at 34.6 frame/s, and the mAP rises to 82.9% with the multi-scale testing strategy.
Funding: This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 11725211, 52005505, and 62001502, and by the Post-graduate Scientific Research Innovation Project of Hunan Province under Grant No. CX20200023.
Abstract: Deep learning for topology optimization has been extensively studied in recent years to reduce computational cost. However, the loss functions of these methods are mainly based on pixel-wise errors from the image perspective, which cannot embed the physical knowledge of topology optimization. This paper therefore presents an improved deep learning model to alleviate this difficulty. The feature pyramid network (FPN), a kind of deep learning model, is trained to learn the inherent physical law of topology optimization itself, with a loss function composed of pixel-wise errors and physical constraints. Since calculating the physical constraints requires finite element analysis (FEA), which is computationally expensive, a strategy of adjusting the time at which the physical constraints are added is proposed to balance training cost against training effect. Two classical topology optimization problems are then investigated to verify the effectiveness of the proposed method. The results show that the developed model, using a small number of samples, can quickly obtain the optimized structure without any iteration, with both high pixel-wise accuracy and good physical performance.
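The idea of a loss that combines pixel-wise errors with a physical term that is switched on only from a chosen point in training can be sketched as follows. This is a minimal sketch under stated assumptions: the paper's physical constraints require FEA, whereas the `volume_constraint_error` below is a cheap hypothetical stand-in, and `physics_start_epoch` and `physics_weight` are illustrative parameters not taken from the abstract.

```python
def pixel_wise_error(predicted, target):
    """Mean squared error between predicted and target density values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def volume_constraint_error(predicted, target_volume_fraction):
    """Hypothetical physical term: squared gap between the mean predicted
    density and a prescribed volume fraction. A placeholder for the FEA-based
    constraints mentioned in the abstract, not the authors' formulation."""
    volume_fraction = sum(predicted) / len(predicted)
    return (volume_fraction - target_volume_fraction) ** 2


def total_loss(predicted, target, target_volume_fraction,
               epoch, physics_start_epoch, physics_weight):
    """Pixel-wise loss, plus the weighted physical term once training has
    reached physics_start_epoch (an assumed scheduling rule)."""
    loss = pixel_wise_error(predicted, target)
    if epoch >= physics_start_epoch:
        loss += physics_weight * volume_constraint_error(
            predicted, target_volume_fraction
        )
    return loss


predicted = [0.5, 0.5]
target = [1.0, 0.0]
early = total_loss(predicted, target, 0.4, epoch=0,
                   physics_start_epoch=5, physics_weight=10.0)
late = total_loss(predicted, target, 0.4, epoch=5,
                  physics_start_epoch=5, physics_weight=10.0)
print(early, late)
```

Delaying the physical term means the expensive evaluation is skipped during early epochs, which is the cost-versus-effect trade-off the abstract describes.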
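Abstract: To address missed and false detections in metal surface defect detection, an improved YOLOv3 algorithm is proposed. First, the activation functions of all residual blocks in the backbone feature extraction network are replaced with a dynamic activation function, and a hybrid attention mechanism is added, strengthening feature extraction for complex defect targets. Then, a new 104×104 feature layer is added to the feature pyramid network, and shallow and deep network features are fused layer by layer, increasing the algorithm's sensitivity to small defect targets. Finally, the K-Means++ clustering algorithm replaces the K-Means clustering algorithm to select the optimal prior (anchor) box sizes for metal surface defect detection, making target localization more accurate. Experimental results show that the improved YOLOv3 algorithm reaches 32.3 frames per second (FPS) and a mean average precision (mAP) of 78.69%, a clear improvement in detection performance.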
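The prior-box selection step can be illustrated with a small Python sketch of K-Means++ seeding followed by ordinary cluster updates on box sizes. The 1 - IoU distance is an assumption borrowed from common YOLO practice, since the abstract does not name the distance measure, and the function names and sample boxes are hypothetical. The sketch also assumes the input contains at least k distinct box sizes.

```python
import random


def iou_wh(box_a, box_b):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box_a[0], box_b[0]) * min(box_a[1], box_b[1])
    union = box_a[0] * box_a[1] + box_b[0] * box_b[1] - inter
    return inter / union


def kmeans_pp_seeds(boxes, k, rng):
    """K-Means++ seeding: first centre uniformly at random, each later centre
    sampled with probability proportional to its squared distance (1 - IoU)
    from the nearest centre chosen so far."""
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        weights = [min(1.0 - iou_wh(box, c) for c in centres) ** 2
                   for box in boxes]
        centres.append(rng.choices(boxes, weights=weights, k=1)[0])
    return centres


def cluster_anchor_sizes(boxes, k, iterations=10, seed=0):
    """Seed with K-Means++, then refine centres by assign-and-average steps."""
    rng = random.Random(seed)
    centres = kmeans_pp_seeds(boxes, k, rng)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for box in boxes:
            # Highest IoU is the same as smallest 1 - IoU distance.
            nearest = max(range(k), key=lambda i: iou_wh(box, centres[i]))
            groups[nearest].append(box)
        # Keep the old centre if a group happens to be empty.
        centres = [
            (sum(b[0] for b in g) / len(g), sum(b[1] for b in g) / len(g))
            if g else c
            for g, c in zip(groups, centres)
        ]
    return sorted(centres)


sample_boxes = [(10, 10), (12, 11), (11, 12), (100, 90), (95, 105), (105, 100)]
anchors = cluster_anchor_sizes(sample_boxes, k=2)
print(anchors)
```

Compared with plain K-Means, only the seeding differs: spreading the initial centres apart tends to make the resulting anchor sizes less sensitive to an unlucky random start.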
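Abstract: To address the low tracking accuracy caused by weak texture in infrared images and occlusion among multiple targets, a fused infrared target tracking model, MSB-YOLOv7-DeepSort, is built on an improved YOLOv7 model and the DeepSort multi-object tracking algorithm. An SE (squeeze and excitation) channel attention mechanism and a bidirectional feature pyramid network are used to improve the quality of feature extraction for infrared targets, and the lightweight network MobileNetV3 replaces the YOLOv7 backbone to increase the inference speed of the fused model. Experimental results show that MSB-YOLOv7-DeepSort performs well in terms of tracking accuracy, tracking precision, proportion of correctly tracked targets, and frame rate.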
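For readers unfamiliar with SE channel attention, the following dependency-free Python sketch shows the general squeeze, excitation, and rescaling steps on a tiny feature tensor. It illustrates the generic SE mechanism only, not the MSB-YOLOv7-DeepSort implementation; the weight matrices, the omission of bias terms, and the function names are assumptions made for brevity.

```python
import math


def squeeze(feature_maps):
    """Global average pooling: one scalar per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]


def dense(vector, weights):
    """Bias-free fully connected layer; weights has one row per output unit."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def excite(channel_means, w_reduce, w_expand):
    """Bottleneck of two dense layers: ReLU, then sigmoid gates in (0, 1)."""
    hidden = [max(0.0, h) for h in dense(channel_means, w_reduce)]
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w_expand)]


def se_block(feature_maps, w_reduce, w_expand):
    """Rescale every channel by its learned gate."""
    gates = excite(squeeze(feature_maps), w_reduce, w_expand)
    return [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_maps, gates)
    ]


# Two 2x2 channels; the weights are hypothetical, hand-picked values.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
w_reduce = [[1.0, 1.0]]    # 2 channels -> 1 hidden unit
w_expand = [[0.0], [1.0]]  # 1 hidden unit -> 2 channel gates
scaled = se_block(features, w_reduce, w_expand)
print(scaled[0][0][0], scaled[1][0][0])
```

In a trained network the two weight matrices are learned, so the gates emphasise channels that are informative for the target and damp the rest.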