Funding: Supported by the National Science Foundation of China (Grant Nos. 52068049 and 51908266), the Science Fund for Distinguished Young Scholars of Gansu Province (No. 21JR7RA267), and the Hongliu Outstanding Young Talents Program of Lanzhou University of Technology.
Abstract: The accumulation of defects on wind turbine blade surfaces can lead to irreversible damage, impacting the aerodynamic performance of the blades. To address the challenge of detecting and quantifying surface defects on wind turbine blades, a blade surface defect detection and quantification method based on an improved DeepLabv3+ deep learning model is proposed. Firstly, an improved method for wind turbine blade surface defect detection, utilizing MobileNetV2 as the backbone feature extraction network, is proposed on the basis of the original DeepLabv3+ model to address its limited robustness. Secondly, by integrating pre-trained weights from transfer learning and implementing a freeze training strategy, both the training speed and the training accuracy of the model are significantly improved. Finally, based on the segmented blade surface defect images, a method for quantifying blade defects is proposed; it combines image stitching algorithms to achieve overall quantification and risk assessment of the entire blade. Test results show that the improved DeepLabv3+ model reduces training time by approximately 43.03% compared with the original model, while achieving mAP and MIoU values of 96.87% and 96.93%, respectively. Moreover, it is robust in detecting different surface defects on blades against different backgrounds. The blade surface defect quantification method enables precise quantification of different defects and assessment of the risk levels associated with defect measurements across the entire blade. This method enables non-contact, long-distance, high-precision detection and quantification of surface defects on the blades, providing a reference for assessing surface defects on wind turbine blades.
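The freeze training strategy described above can be sketched as follows. This is a minimal illustration with a stand-in two-part model (hypothetical module names; the paper's actual DeepLabv3+/MobileNetV2 code is not shown): the pre-trained backbone is frozen while the head trains, then released for fine-tuning.

```python
import torch
import torch.nn as nn

class TinySegModel(nn.Module):
    """Stand-in for a DeepLabv3+-style model: backbone + segmentation head."""
    def __init__(self, num_classes=2):
        super().__init__()
        # Stand-in for a MobileNetV2 backbone loaded with pre-trained weights.
        self.backbone = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
        # Stand-in for the DeepLabv3+ segmentation head.
        self.head = nn.Conv2d(8, num_classes, 1)

    def forward(self, x):
        return self.head(self.backbone(x))

model = TinySegModel()

# Freeze phase: backbone weights are fixed, only the head receives gradients.
for p in model.backbone.parameters():
    p.requires_grad = False

# Build the optimizer over trainable parameters only.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)  # → 18 (head parameters only)

# Unfreeze phase: after a few epochs, release the backbone for fine-tuning.
for p in model.backbone.parameters():
    p.requires_grad = True
```

Freezing the backbone early on is what speeds up training: gradients are computed and stored only for the small head until the unfreeze phase begins.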
Funding: Supported by the Stable-Support Scientific Project of the China Research Institute of Radiowave Propagation (Grant No. A13XXXXWXX), the National Natural Science Foundation of China (Grant Nos. 42174210, 4207202, and 42188101), and the Strategic Pioneer Program on Space Science, Chinese Academy of Sciences (Grant No. XDA15014800).
Abstract: The Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) satellite is a small magnetosphere–ionosphere link explorer developed cooperatively between China and Europe. It pioneers the use of X-ray imaging technology to perform large-scale imaging of the Earth's magnetosheath and polar cusp regions. It uses a high-precision ultraviolet imager to image the overall configuration of the aurora and monitor changes in the solar wind source in real time, and it carries in situ detection instruments to improve understanding of the relationship between solar activity and changes in the Earth's magnetic field. The SMILE satellite is scheduled to launch in 2025. The European Incoherent Scatter Scientific Association (EISCAT) 3D radar is a new-generation European incoherent scatter radar constructed by EISCAT and is the most advanced ground-based ionospheric experimental facility in the high-latitude polar region. It has multibeam, multidirectional, quasi-real-time three-dimensional (3D) imaging capabilities; continuous monitoring and operation capabilities; and multiple-baseline interferometry capabilities. Joint detection by the SMILE satellite and the EISCAT-3D radar is of great significance for revealing the coupling process of the solar wind–magnetosphere–ionosphere system. We therefore analyzed the joint detection capability of the SMILE satellite and EISCAT-3D, determined the periods during which the two can perform joint detection, and defined the key scientific problems that joint detection can address. In addition, we developed Web-based software to search for and visualize the joint detection periods of the SMILE satellite and the EISCAT-3D radar, laying the foundation for subsequent joint detection experiments and scientific research.
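The core of a joint-detection-period search like the one described above is intersecting two sets of observation windows. A minimal sketch, with purely illustrative times (the actual software's orbit and radar-schedule inputs are not shown):

```python
def intersect_windows(a, b):
    """Intersect two sorted lists of (start, end) time windows."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:            # non-empty overlap → joint detection period
            out.append((start, end))
        # advance whichever window ends first
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out

smile = [(0, 5), (8, 12)]      # hypothetical SMILE imaging windows (hours)
eiscat = [(3, 9), (11, 15)]    # hypothetical EISCAT-3D operation windows
print(intersect_windows(smile, eiscat))  # → [(3, 5), (8, 9), (11, 12)]
```

The same two-pointer pass works on any monotone window lists, so orbital visibility, radar schedules, and geometry constraints can each contribute one list and be folded in pairwise.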
Funding: Supported by a grant from the National Key Research and Development Project (2023YFB4302100), the Key Research and Development Project of Jiangxi Province (No. 20232ACE01011), and the Independent Deployment Project of the Ganjiang Innovation Research Institute, Chinese Academy of Sciences (E255J001).
Abstract: Aiming at the limitations of existing railway foreign object detection methods based on two-dimensional (2D) images, such as short detection distance, strong environmental influence, and lack of distance information, we propose Rail-PillarNet, a three-dimensional (3D) LiDAR (Light Detection and Ranging) railway foreign object detection method based on an improved PointPillars. Firstly, a parallel attention pillar encoder (PAPE) is designed to fully extract pillar features and alleviate the loss of local fine-grained information in the PointPillars pillar encoder. Secondly, a fine backbone network is designed to improve the feature extraction capability of the network by combining the coding characteristics of LiDAR point cloud features with a residual structure. Finally, the initial weight parameters of the model are optimised by a transfer learning training method to further improve accuracy. Experimental results on the OSDaR23 dataset show that the average accuracy of Rail-PillarNet reaches 58.51%, higher than most mainstream models, with 5.49 M parameters. Compared with PointPillars, the accuracy for each target is improved by 10.94%, 3.53%, 16.96% and 19.90%, respectively, while the number of parameters increases by only 0.64 M, achieving a balance between parameter count and accuracy.
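The pillar representation that PointPillars-style encoders (including the PAPE above) start from is simply a grouping of LiDAR points into a 2D grid on the ground plane. A minimal sketch with illustrative grid parameters (not the paper's actual configuration):

```python
import numpy as np

def pillarize(points, x_range=(0.0, 8.0), y_range=(-4.0, 4.0), pillar=2.0):
    """Group (x, y, z) points by the pillar cell their (x, y) falls into."""
    pillars = {}
    for p in points:
        ix = int((p[0] - x_range[0]) // pillar)   # pillar column index
        iy = int((p[1] - y_range[0]) // pillar)   # pillar row index
        pillars.setdefault((ix, iy), []).append(p)
    return pillars

pts = np.array([[0.5, -3.5, 0.2],    # two nearby points share a pillar
                [0.9, -3.1, 0.4],
                [7.5, 3.5, 1.0]])    # one distant point in its own pillar
groups = pillarize(pts)
print({k: len(v) for k, v in groups.items()})  # → {(0, 0): 2, (3, 3): 1}
```

Each non-empty pillar is then encoded into a fixed-length feature vector; it is at that per-pillar encoding step that fine-grained local detail can be lost, which is the problem the parallel attention encoder targets.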
Funding: Supported in part by the Major Project for New Generation of AI (2018AAA0100400), the National Natural Science Foundation of China (61836014, U21B2042, 62072457, 62006231), and the InnoHK Program.
Abstract: Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps with off-the-shelf depth estimators and use them as an additional input to augment the RGB images. Depth-based methods attempt to convert estimated depth maps to pseudo-LiDAR and then apply LiDAR-based object detectors, or focus on image–depth fusion learning. However, they demonstrate limited performance and efficiency as a result of depth inaccuracy and complex convolutional fusion modes. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors in depth maps and thereby obtain more accurate depth information. We then develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them using cross-attention to exchange information. Furthermore, with the help of pixel-wise relative depth values in the depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture more accurate sequence ordering of input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. Experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of our method and its superior performance over previous counterparts.
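The two-branch cross-attention fusion described above can be sketched in a few lines with a standard attention layer. Dimensions and tensors here are illustrative only; this is not the paper's actual NF-DVT implementation (which also injects depth-derived relative position embeddings into the attention scores):

```python
import torch
import torch.nn as nn

dim, n_tokens = 32, 16
rgb = torch.randn(1, n_tokens, dim)     # tokens from the RGB patch branch
depth = torch.randn(1, n_tokens, dim)   # tokens from the depth-map patch branch

attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)

# Query with RGB tokens, key/value with depth tokens: each RGB token gathers
# depth information weighted by cross-attention. The symmetric direction
# (depth querying RGB) would use a second such layer.
fused, weights = attn(query=rgb, key=depth, value=depth)
print(fused.shape)  # → torch.Size([1, 16, 32])
```

The output keeps the RGB branch's token layout, so the fused sequence can flow straight into the next transformer stage.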
Funding: Supported by the Research and Development Project of Key Core Technology and Common Technology in Shanxi Province (No. 2020XXX009).
Abstract: To solve the problems of online target detection on embedded platforms, such as high missed detection rate, low accuracy, and slow speed, a lightweight target recognition method, MobileNetV3-CenterNet, is proposed. This method combines the anchor-free CenterNet network with the MobileNetV3-small network and is trained on the University at Albany Detection and Tracking (UA-DETRAC) and the Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes (PASCAL VOC) 07+12 standard datasets. While reducing the scale of the network model, the MobileNetV3-CenterNet model strikes a good balance between accuracy and speed of target recognition and effectively solves the problem of missing dense and small targets in online detection. To verify its recognition performance, the model is tested on 2683 images from the UA-DETRAC and PASCAL VOC 07+12 datasets and compared with the CenterNet-Deep Layer Aggregation (DLA) 34, CenterNet-Residual Network (ResNet) 18, CenterNet-MobileNetV3-large, You Only Look Once version 3 (YOLOv3), MobileNetV2-YOLOv3, Single Shot MultiBox Detector (SSD), MobileNetV2-SSD and Faster Region-based Convolutional Neural Network (R-CNN) models. The results show that the MobileNetV3-CenterNet model accurately recognizes the dense and small targets missed by other methods and achieves a recognition accuracy of 99.4% at running speeds of 53 frames/s (on a server) and 14 frames/s (on an iPad). The MobileNetV3-CenterNet lightweight target recognition method provides effective technical support for deep learning target recognition on embedded platforms and in online detection.
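The anchor-free idea behind CenterNet is that objects are detected as peaks in a per-class center heatmap rather than via anchor boxes. A minimal sketch of the peak-extraction step with illustrative heatmap values (real implementations use a max-pooling trick for speed and regress box sizes at each peak):

```python
import numpy as np

def heatmap_peaks(heat, thresh=0.5):
    """Return (row, col, score) for cells that are 3x3 local maxima above thresh."""
    peaks = []
    h, w = heat.shape
    for r in range(h):
        for c in range(w):
            v = heat[r, c]
            if v < thresh:
                continue
            window = heat[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            if v >= window.max():          # local maximum in its neighborhood
                peaks.append((r, c, float(v)))
    return peaks

heat = np.zeros((5, 5))
heat[1, 1] = 0.9   # one strong object center
heat[3, 4] = 0.6   # a weaker one
print(heatmap_peaks(heat))  # → [(1, 1, 0.9), (3, 4, 0.6)]
```

Because every heatmap cell can be a detection, densely packed small objects do not have to compete for a limited set of anchors, which is one reason anchor-free heads handle the dense targets mentioned above well.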
Funding: The authors would like to thank the financial support of the Natural Science Foundation of Anhui Province (No. 2208085MF173), the Key Research and Development Projects of Anhui (202104a05020003), the Anhui Development and Reform Commission R&D and Innovation Support Project ([2020]479), the National Natural Science Foundation of China (51575001), and the Anhui University Scientific Research Platform Innovation Team Building Project (2016-2018).
Abstract: In complex traffic environments, it is very important for autonomous vehicles to accurately perceive, in advance, the dynamic information of other vehicles around them. The accuracy of 3D object detection is affected by problems such as illumination changes, object occlusion, and detection distance. To face these challenges, we propose a multimodal feature fusion network for 3D object detection (MFF-Net). This paper first uses a spatial transformation projection algorithm to map the image features into the feature space, so that the image features share the same spatial dimension as the point cloud features when fused. Then, feature channel weighting is performed using an adaptive expression augmentation fusion network to enhance important features, suppress useless features, and increase the directionality of the network toward features. Finally, this paper reduces the probability of false and missed detections in the non-maximum suppression (NMS) algorithm by adjusting the one-dimensional threshold, completing a 3D target detection network based on multimodal feature fusion. Experimental results show that the proposed method achieves an average accuracy of 82.60% on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset, outperforming previous state-of-the-art multimodal fusion networks. On the Easy, Moderate, and Hard evaluation levels, the accuracy reaches 90.96%, 81.46%, and 75.39%, respectively, showing that the MFF-Net network performs well in 3D object detection.
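The NMS step whose threshold the abstract tunes works as follows: keep the highest-scoring box, discard boxes that overlap it beyond an IoU threshold, and repeat. A minimal 2D sketch with illustrative boxes (the 3D case replaces IoU of rectangles with IoU of rotated 3D boxes):

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: return indices of kept boxes, best scores first."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # → [0, 2]: box 1 is suppressed by box 0
```

Raising `iou_thresh` keeps more overlapping candidates (fewer missed detections, more duplicates); lowering it suppresses more aggressively, which is the trade-off the threshold adjustment above targets.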
Funding: Supported by the National Key Research and Development Program of China (2020YFB1807500), the National Natural Science Foundation of China (62072360, 62001357, 62172438, 61901367), the Key Research and Development Plan of Shaanxi Province (2021ZDLGY02-09, 2023-GHZD-44, 2023-ZDLGY-54), the Natural Science Foundation of Guangdong Province of China (2022A1515010988), the Key Project on Artificial Intelligence of the Xi'an Science and Technology Plan (2022JH-RGZN-0003, 2022JH-RGZN-0103, 2022JH-CLCJ-0053), the Xi'an Science and Technology Plan (20RGZN0005), and the Proof-of-concept Fund from the Hangzhou Research Institute of Xidian University (GNYZ2023QC0201).
Abstract: The high bandwidth and low latency of 6G network technology enable the application of monocular 3D object detection on vehicle platforms. Pseudo-LiDAR based on monocular 3D object detection is a low-cost, low-power alternative to LiDAR solutions in autonomous driving. However, this technique has two problems: (1) the poor quality of the generated Pseudo-LiDAR point clouds, resulting from the nonlinear error distribution of monocular depth estimation, and (2) the weak representation capability of point cloud features, because the global geometric structure of the point cloud is neglected in LiDAR-based 3D detection networks. Therefore, we propose a Pseudo-LiDAR confidence sampling strategy and a hierarchical geometric feature extraction module for monocular 3D object detection. We first design a point cloud confidence sampling strategy based on a 3D Gaussian distribution, which assigns small confidence to points with large depth-estimation error and filters them out accordingly. We then present a hierarchical geometric feature extraction module that aggregates local neighborhood features and uses a dual transformer to capture the global geometric features of the point cloud. Finally, our detection framework builds on Point-Voxel R-CNN (PV-RCNN), taking the high-quality Pseudo-LiDAR and the enriched geometric features as input. Experimental results show that our method achieves satisfactory performance in monocular 3D object detection.
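The confidence sampling idea above can be sketched with a one-dimensional Gaussian of the assumed depth error; all values here are illustrative, and the paper's actual 3D Gaussian formulation over the point cloud is not shown:

```python
import numpy as np

def filter_by_confidence(points, depth_err, sigma=1.0, keep_thresh=0.5):
    """Drop Pseudo-LiDAR points whose Gaussian depth-error confidence is low."""
    conf = np.exp(-0.5 * (depth_err / sigma) ** 2)   # Gaussian confidence in [0, 1]
    return points[conf >= keep_thresh], conf

pts = np.array([[1.0, 0.0, 5.0],     # near point, small depth error
                [2.0, 1.0, 40.0]])   # far point, large depth error
err = np.array([0.2, 3.0])           # assumed depth-estimation errors (metres)
kept, conf = filter_by_confidence(pts, err)
print(len(kept))  # → 1: the high-error point is filtered out
```

Because monocular depth error grows nonlinearly with distance, this kind of confidence gate removes exactly the far, unreliable points that would otherwise pollute the Pseudo-LiDAR cloud fed to the detector.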