Funding: the Scientific Research Fund of Hunan Provincial Education Department (23A0423).
Abstract: Remote sensing imagery, captured from high altitude, is characterized by multiple scales, small target areas, and intricate backgrounds. These traits often increase the missed and false detection rates of object recognition algorithms applied to remote sensing imagery. They also contribute to inaccurate target localization and hinder precise target categorization. This paper addresses these challenges by proposing the YOLO-MFD model (YOLO-MFD: Remote Sensing Image Object Detection with Multi-scale Fusion Dynamic Head). Before presenting our method, we examine the prevalent issues in remote sensing imagery analysis. Specifically, we emphasize how existing object recognition algorithms struggle to capture critical image features comprehensively amid varying scales and complex backgrounds. To resolve these issues, we introduce a novel approach. First, we propose a lightweight multi-scale module called CEF. By merging multi-scale feature information, this module significantly improves the model's ability to capture important image features, and it addresses the missed detections and false alarms that are common in remote sensing imagery. Second, an additional layer of small-target detection heads is added, and a residual link is established with the higher-level feature extraction module in the backbone. This allows the model to incorporate shallower information, significantly improving the accuracy of target localization in remote sensing images. Finally, a dynamic head attention mechanism is introduced, which gives the model greater flexibility and accuracy in recognizing shapes and targets of different sizes. Consequently, the precision of object detection is significantly improved. Experimental results show that the YOLO-MFD model improves on the original YOLOv8 model by 6.3%, 3.5%, and 2.5% in Precision, mAP@0.5, and mAP@0.5:0.95, respectively. These results illustrate the clear advantages of the method.
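The Precision and mAP@0.5 figures above both rest on counting a predicted box as correct when its intersection over union (IoU) with a ground-truth box reaches a threshold. The following is a minimal sketch of that idea, not the evaluation code used for YOLO-MFD. The function names, the `(x1, y1, x2, y2)` box layout, and the greedy matching rule are assumptions made for illustration.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_at_iou(preds, truths, thresh=0.5):
    """Precision = TP / (TP + FP) with greedy one-to-one matching.

    preds is a list of (box, score); truths is a list of boxes.
    Predictions are visited from highest to lowest score (an assumption).
    """
    unmatched = list(truths)
    tp = fp = 0
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= thresh:
            unmatched.remove(best)  # each ground-truth box is matched once
            tp += 1
        else:
            fp += 1
    return tp / (tp + fp) if tp + fp else 0.0
```

mAP@0.5:0.95 repeats the same matching at IoU thresholds from 0.5 to 0.95 and averages the resulting average precision values, which is why it is the stricter of the two mAP figures.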
Funding: the Key Research and Development Program of Hainan Province (Grant Nos. ZDYF2023GXJS163, ZDYF2024GXJS014); National Natural Science Foundation of China (NSFC) (Grant Nos. 62162022, 62162024); the Major Science and Technology Project of Hainan Province (Grant No. ZDKJ2020012); Hainan Provincial Natural Science Foundation of China (Grant No. 620MS021); Youth Foundation Project of Hainan Natural Science Foundation (621QN211).
Abstract: Accurately identifying small objects in high-resolution aerial images is a complex and crucial task in small object detection on unmanned aerial vehicles (UAVs). The task is challenging because of variations in UAV flight altitude, differences in object scale, and factors such as flight speed and motion blur. To improve the detection of small targets in drone aerial imagery, we propose an enhanced You Only Look Once version 7 (YOLOv7) algorithm based on multi-scale spatial context. We build the MSC-YOLO model, which incorporates an additional prediction head, denoted P2, to improve adaptability to small objects. We replace conventional downsampling with a Spatial-to-Depth Convolutional Combination (CSPDC) module to mitigate the loss of fine feature details related to small objects. Furthermore, we propose a Spatial Context Pyramid with Multi-Scale Attention (SCPMA) module, which captures spatial and channel-dependent features of small targets across multiple scales. This module enhances the perception of spatial contextual features and the use of multi-scale feature information. On the VisDrone2023 and UAVDT datasets, MSC-YOLO outperforms the baseline YOLOv7 by 3.0% in mean average precision (mAP). The proposed MSC-YOLO algorithm performs satisfactorily in detecting small targets in UAV aerial photography, providing strong support for practical applications.
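The abstract does not describe the internals of CSPDC, so the following is only a sketch of the generic space-to-depth rearrangement that its name suggests, written in plain Python on nested lists. The function name, the `[C][H][W]` layout, and the output channel ordering are assumptions. It shows why this kind of downsampling loses no pixel values: each 2x2 spatial block is moved into the channel dimension instead of being discarded or averaged.

```python
def space_to_depth(x, block=2):
    """Rearrange a [C][H][W] nested list into [C*block*block][H/block][W/block].

    Every input value appears exactly once in the output, so the spatial
    resolution drops by `block` without throwing information away.
    """
    c, h, w = len(x), len(x[0]), len(x[0][0])
    if h % block or w % block:
        raise ValueError("H and W must be divisible by block")
    out = []
    for dy in range(block):          # row offset inside each block
        for dx in range(block):      # column offset inside each block
            for ch in range(c):
                out.append([[x[ch][i][j] for j in range(dx, w, block)]
                            for i in range(dy, h, block)])
    return out


# One 4x4 channel becomes four 2x2 channels.
feature_map = [[[0, 1, 2, 3],
                [4, 5, 6, 7],
                [8, 9, 10, 11],
                [12, 13, 14, 15]]]
downsampled = space_to_depth(feature_map)
```

In a real network a convolution would normally follow this rearrangement to mix the stacked channels; that step is omitted here.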
Funding: Supported by the National Natural Science Foundation of China (No. 40701147), the Natural Science Foundation of Beijing (No. 8102014), and the Postdoctoral Science Foundation of China (Special Issue) (No. 200801096).
Abstract: To solve the problem of road network object selection at an arbitrary unknown scale, this paper integrates and extends existing object selection methods and proposes a new object interpolation method. The method reflects the inheritable and transferable characteristics of related information among multi-scale representation objects and takes attribute effects into account. The basic idea, overall framework, and technical flow of the interpolation are then put forward, and at the same time the synthetic weight function of the interpolation method is defined and described. The method and technical strategies of object selection are extended, and the key problems are solved, including the design of objective quantitative and structural selection based on the weight values, and the interpolation experiment strategies and technical flows. The test results show that the object interpolation method not only inherits the objects at smaller scales but also takes the attribute effect into account when deriving objects from larger scales according to road importance, which supports objective selection of road objects at intermediate scales.
Funding: The National Natural Science Foundation of China (Nos. 41101362, 41471386).
Abstract: In this paper, we propose a new method for automatically matching multi-scale roads under the constraints of smaller-scale data. The matching process is as follows. First, meshes are extracted from road data at two different scales. Second, several basic meshes in the larger-scale road network are merged into a composite mesh that is matched with one mesh in the smaller-scale road network, completing the N:1 (N>1) and 1:1 matching. Third, meshes of the two road datasets with M:N (M>1, N>1) matching relationships are matched. Finally, roads are classified into two categories under the constraints of the meshes, mesh boundary roads and mesh internal roads, and roads at the two scales are then matched within their own categories according to the mesh matching relationships. The results show that roads of different scales are matched more precisely with the proposed method.
Abstract: Automated object detection has received a great deal of attention over the years. Use cases ranging from autonomous driving applications to military surveillance systems require robust detection of objects under different illumination conditions. State-of-the-art object detectors tend to perform well in daytime conditions. However, their performance is severely hampered at night due to poor illumination. To address this challenge, the manuscript proposes an improved YOLOv5-based object detection framework for effective detection in unevenly illuminated nighttime conditions. First, the preprocessing stage uses the Zero-DCE++ approach to enhance low-light images. The existing YOLOv5 architecture is then optimized by integrating the Convolutional Block Attention Module (CBAM) in the backbone network to boost model learning capability and a Depthwise Convolutional module (DWConv) in the neck network for efficient compression of network parameters. The Night Object Detection (NOD) and Exclusively Dark (ExDARK) datasets were used for this work. The proposed framework detects classes such as humans, bicycles, and cars. Experiments demonstrate that the proposed architecture achieves a higher Mean Average Precision (mAP) along with a reduction in model size and total parameters. The proposed model is lighter by 11.24% in terms of model size and 12.38% in terms of parameters when compared to baseline YOLOv5.
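To see why a depthwise convolution compresses parameters, it helps to count weights directly. The sketch below uses the standard weight-count formula for a grouped 2D convolution; the function name and the 256-channel, 3x3 layer are illustrative assumptions and are not taken from the paper. The 12.38% figure in the abstract is a whole-model number and cannot be derived from a single layer like this one.

```python
def conv_params(c_in, c_out, k, groups=1, bias=False):
    """Weight count of a 2D convolution with k x k kernels and `groups` groups."""
    if c_in % groups or c_out % groups:
        raise ValueError("channels must be divisible by groups")
    weights = (c_in // groups) * c_out * k * k
    return weights + (c_out if bias else 0)


# Hypothetical 256 -> 256 channel layer with 3x3 kernels.
standard = conv_params(256, 256, 3)               # 589824 weights
depthwise = conv_params(256, 256, 3, groups=256)  # 2304 weights
```

With one group per channel, each filter sees a single input channel instead of all 256, which is where the reduction comes from. The trade-off is that channels are no longer mixed inside that layer.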
Funding: the National Natural Science Foundation of China (No. 62063006); the Natural Science Foundation of Guangxi Province (No. 2023GXNSFAA026025); the Innovation Fund of Chinese Universities Industry-University-Research (ID: 2021RYC06005); the Research Project for Young and Middle-Aged Teachers in Guangxi Universities (ID: 2020KY15013); the Special Research Project of Hechi University (ID: 2021GCC028); supported by the Project of Outstanding Thousand Young Teachers' Training in Higher Education Institutions of Guangxi, Guangxi Colleges and Universities Key Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region.
Abstract: In recent years, as smart city construction has deepened, the field of intelligent transportation has changed and improved significantly. Semantic segmentation of road scenes has important practical significance for automatic driving, transportation planning, and intelligent transportation systems. However, current mainstream lightweight semantic segmentation models for road scenes suffer from poor segmentation of small targets and insufficiently refined segmentation edges. This article therefore proposes a lightweight semantic segmentation model that improves on LiteSeg to address these issues. The model replaces the LiteSeg backbone with the lightweight MobileNet backbone to reduce network parameters and computation, and adds the Coordinate Attention (CA) mechanism to help the network capture long-distance dependencies. In addition, by combining the dependencies of spatial and channel information, the Spatial and Channel Network (SCNet) attention mechanism is proposed to improve the model's feature extraction ability. Finally, a multiscale transposed attention encoding (MTAE) module is proposed to obtain features at different resolutions and fuse them. The proposed model is verified on the Cityscapes dataset. The experimental results show that adding the SCNet and MTAE modules increases the mean Intersection over Union (mIoU) of the original LiteSeg model by 4.69%. On this basis, the backbone network is replaced with MobileNet and the CA mechanism is added. At the cost of only a minimal increase in model parameters and computation, the mIoU improves on the original LiteSeg model by 2.46%. This article also compares the proposed model with several current lightweight semantic segmentation models, and experiments show that the proposed model has the best overall performance, with particularly good results in small object segmentation. Finally, generalization tests on the KITTI dataset show that the proposed algorithm generalizes to a certain degree.
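The mIoU metric reported above is conventionally computed from a per-class confusion matrix of pixel counts. The sketch below follows that standard definition; it is not the authors' evaluation code. The function name, the `conf[true][pred]` layout, and the choice to skip classes that appear in neither the prediction nor the ground truth are assumptions.

```python
def mean_iou(conf):
    """Mean IoU from a square confusion matrix conf[true][pred] of pixel counts."""
    n = len(conf)
    ious = []
    for k in range(n):
        tp = conf[k][k]
        fn = sum(conf[k]) - tp                          # class k pixels predicted as other classes
        fp = sum(conf[r][k] for r in range(n)) - tp     # other pixels predicted as class k
        denom = tp + fp + fn
        if denom:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Two-class toy example: rows are ground truth, columns are predictions.
toy = [[8, 2],
       [1, 9]]
score = mean_iou(toy)
```

Because every class contributes equally to the mean, a small class that is segmented poorly lowers mIoU as much as a large one, which is why the metric is sensitive to small-object performance.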
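Abstract: Spilled objects on highways are one of the important factors affecting traffic safety. To address missed detections, false detections, and difficult localization when detecting small- and medium-scale highway spilled objects, this paper proposes a detection and localization method for highway spilled objects based on image guidance and point cloud spatial constraints. The method uses an improved YOLOv7-OD network to process image data and obtain 2D target prediction boxes, then projects the prediction boxes into the LiDAR coordinate system to obtain a cone-shaped region of interest (ROI). Under the spatial constraints of the point cloud inside the ROI, point cloud clustering and point cloud generation algorithms are combined to obtain detection and localization results for spilled objects of different scales in 3D space. Experiments show that the improved YOLOv7-OD network achieves a recall of 85.4% and an average precision of 82.0% on medium-scale targets, improvements of 6.6 and 8.0 percentage points over YOLOv7, respectively. On small-scale targets, recall and average precision are 66.8% and 57.3%, each an improvement of 5.3 percentage points. For localization, for targets 30 to 40 m from the detection vehicle, the depth localization error is 0.19 m and the angular localization error is 0.082°, achieving detection and localization of multi-scale highway spilled objects.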
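One common way to obtain a cone-shaped (frustum) ROI from a 2D detection box is to keep every 3D point whose image projection falls inside the box. The sketch below illustrates that idea under a simple pinhole camera model; it is not the paper's implementation. It assumes the points have already been transformed from the LiDAR frame into the camera frame (x right, y down, z forward), and the function name and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical.

```python
def frustum_points(points, box, fx, fy, cx, cy):
    """Keep 3D points (camera frame, z forward) whose projection lies in `box`.

    box is (u_min, v_min, u_max, v_max) in pixels. The LiDAR-to-camera
    extrinsic transform is assumed to have been applied beforehand.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        if z <= 0:  # behind the camera: cannot project into the image
            continue
        u = fx * x / z + cx  # pinhole projection to pixel coordinates
        v = fy * y / z + cy
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


# Hypothetical intrinsics and a detection box around the image centre.
cloud = [(0.0, 0.0, 35.0), (5.0, 0.0, 35.0), (0.0, 0.0, -10.0), (1.0, 0.5, 35.0)]
roi = frustum_points(cloud, (600, 320, 680, 400), fx=1000, fy=1000, cx=640, cy=360)
```

The points that survive this filter are the ones a clustering step would then operate on, which keeps the 3D search restricted to the region the image detector flagged.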