Funding: supported by the Stable-Support Scientific Project of the China Research Institute of Radio-wave Propagation (Grant No. A13XXXXWXX), the National Natural Science Foundation of China (Grant Nos. 42174210, 4207202, and 42188101), and the Strategic Pioneer Program on Space Science, Chinese Academy of Sciences (Grant No. XDA15014800).
Abstract: The Solar wind Magnetosphere Ionosphere Link Explorer (SMILE) satellite is a small magnetosphere–ionosphere link explorer developed cooperatively by China and Europe. It pioneers the use of X-ray imaging technology to perform large-scale imaging of the Earth's magnetosheath and polar cusp regions. It uses a high-precision ultraviolet imager to image the overall configuration of the aurora and to monitor changes in the solar wind source in real time, and it carries in situ detection instruments to improve our understanding of the relationship between solar activity and changes in the Earth's magnetic field. The SMILE satellite is scheduled to launch in 2025. The European Incoherent Scatter Scientific Association (EISCAT) 3D radar (EISCAT-3D) is a new-generation European incoherent scatter radar constructed by EISCAT and is the most advanced ground-based ionospheric experimental facility in the high-latitude polar region. It provides multibeam, multidirectional, quasi-real-time three-dimensional (3D) imaging capabilities, continuous monitoring and operation capabilities, and multiple-baseline interferometry capabilities. Joint detection by the SMILE satellite and the EISCAT-3D radar is of great significance for revealing the coupling processes of the solar wind–magnetosphere–ionosphere system. We therefore analyzed the joint detection capability of the SMILE satellite and EISCAT-3D, determined the periods during which the two can perform joint detection, and defined the key scientific problems that joint detection can address. In addition, we developed Web-based software to search for and visualize the joint detection periods of the SMILE satellite and the EISCAT-3D radar, which lays the foundation for subsequent joint detection experiments and scientific research.
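As an illustration of the kind of joint-detection-period search the Web-based software performs, the following Python sketch finds intervals when a satellite ground track passes near the EISCAT-3D core site. The ephemeris format, the great-circle-distance criterion, and the 1500 km threshold are illustrative assumptions only; the actual tool would work from the SMILE orbit and the instrument and radar fields of view.

```python
# Minimal sketch of a joint-detection-window search, not the paper's software.
# It assumes a SMILE ephemeris is available as (time, sub-satellite lat, lon)
# samples and uses a simple distance criterion around the EISCAT-3D core site.
from dataclasses import dataclass
from datetime import datetime
from math import radians, sin, cos, asin, sqrt
from typing import List, Tuple

EISCAT3D_LAT, EISCAT3D_LON = 69.34, 20.31   # Skibotn core site (approximate)

@dataclass
class EphemerisSample:
    time: datetime
    lat_deg: float   # sub-satellite latitude
    lon_deg: float   # sub-satellite longitude

def great_circle_km(lat1, lon1, lat2, lon2) -> float:
    """Haversine distance between two points on a spherical Earth (km)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))

def joint_windows(ephemeris: List[EphemerisSample],
                  max_distance_km: float = 1500.0) -> List[Tuple[datetime, datetime]]:
    """Return (start, end) intervals when the sub-satellite point is within
    max_distance_km of the EISCAT-3D site -- a stand-in for the real
    field-of-view overlap test."""
    windows, start, prev = [], None, None
    for s in ephemeris:
        near = great_circle_km(s.lat_deg, s.lon_deg,
                               EISCAT3D_LAT, EISCAT3D_LON) <= max_distance_km
        if near and start is None:
            start = s.time
        elif not near and start is not None:
            windows.append((start, prev.time))
            start = None
        prev = s
    if start is not None:
        windows.append((start, prev.time))
    return windows
```

A real search would additionally filter the candidate windows by radar operating schedules and by whether the X-ray imager's field of view covers the magnetosheath region magnetically conjugate to the radar.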
Funding: supported by a grant from the National Key Research and Development Project (2023YFB4302100), the Key Research and Development Project of Jiangxi Province (No. 20232ACE01011), and the Independent Deployment Project of the Ganjiang Innovation Research Institute, Chinese Academy of Sciences (E255J001).
Abstract: To address the limitations of existing railway foreign object detection methods based on two-dimensional (2D) images, such as short detection distance, strong environmental influence, and lack of distance information, we propose Rail-PillarNet, a three-dimensional (3D) LiDAR (Light Detection and Ranging) railway foreign object detection method built on an improved PointPillars. First, a parallel attention pillar encoder (PAPE) is designed to fully extract pillar features and alleviate the loss of local fine-grained information in the PointPillars pillar encoder. Second, a refined backbone network is designed to improve the feature extraction capability of the network by combining the encoding characteristics of LiDAR point cloud features with a residual structure. Finally, the initial weight parameters of the model are optimized by transfer learning to further improve accuracy. Experimental results on the OSDaR23 dataset show that the average accuracy of Rail-PillarNet reaches 58.51%, higher than most mainstream models, with 5.49 M parameters. Compared with PointPillars, the accuracy for each target class is improved by 10.94%, 3.53%, 16.96%, and 19.90%, respectively, while the number of parameters increases by only 0.64 M, achieving a balance between parameter count and accuracy.
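The abstract does not give the internal structure of the PAPE, so the following PyTorch sketch only illustrates the general idea of adding parallel channel and point attention branches to a PointPillars-style pillar feature network; the module and layer choices are assumptions, not the paper's actual PAPE.

```python
# Minimal sketch: a PointPillars-style per-point MLP augmented with two
# attention branches computed in parallel (channel attention and per-point
# attention) before max pooling over each pillar.  Illustrative only.
import torch
import torch.nn as nn

class ParallelAttentionPillarEncoder(nn.Module):
    def __init__(self, in_channels: int = 9, out_channels: int = 64):
        super().__init__()
        # Shared per-point MLP, as in the original PointPillars feature net.
        self.mlp = nn.Sequential(
            nn.Linear(in_channels, out_channels),
            nn.BatchNorm1d(out_channels),
            nn.ReLU(inplace=True),
        )
        # Channel-attention branch: reweights feature channels per pillar.
        self.channel_att = nn.Sequential(
            nn.Linear(out_channels, out_channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(out_channels // 4, out_channels),
            nn.Sigmoid(),
        )
        # Point-attention branch: reweights individual points in a pillar so
        # fine-grained local structure is not lost by plain max pooling.
        self.point_att = nn.Sequential(nn.Linear(out_channels, 1), nn.Sigmoid())

    def forward(self, pillars: torch.Tensor) -> torch.Tensor:
        # pillars: (P, N, C) -> P pillars, N points per pillar, C raw features
        P, N, C = pillars.shape
        x = self.mlp(pillars.reshape(P * N, C)).reshape(P, N, -1)   # (P, N, D)
        ca = self.channel_att(x.max(dim=1).values)                  # (P, D)
        pa = self.point_att(x)                                      # (P, N, 1)
        x = x * ca.unsqueeze(1) * pa     # apply both attentions in parallel
        return x.max(dim=1).values       # (P, D) pillar features, as in PointPillars

# Example: encode 1000 pillars of 32 points with 9 raw features each.
# feats = ParallelAttentionPillarEncoder()(torch.randn(1000, 32, 9))
```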
Funding: supported in part by the Major Project for New Generation of AI (2018AAA0100400), the National Natural Science Foundation of China (61836014, U21B2042, 62072457, 62006231), and the InnoHK Program.
Abstract: Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps from off-the-shelf depth estimators and then use them as an additional input to augment the RGB images. Depth-based methods either convert the estimated depth maps to pseudo-LiDAR and apply LiDAR-based object detectors, or focus on image and depth fusion learning. However, they show limited performance and efficiency because of depth inaccuracy and complex convolutional fusion schemes. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors on depth maps and obtain more accurate depth information. We then develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them with cross-attention so that the branches exchange information. Furthermore, with the help of pixel-wise relative depth values in the depth maps, we develop new relative position embeddings in the cross-attention mechanism to capture more accurate sequence ordering of the input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. Experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of the proposed method and its superior performance over previous counterparts.
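To make the fusion step concrete, the following PyTorch sketch shows one way RGB tokens can attend to depth tokens with an additional bias computed from pixel-wise relative depth, in the spirit of the NF-DVT cross-attention described above; the projection layout and the form of the depth bias are assumptions rather than the authors' implementation.

```python
# Minimal sketch of cross-attention fusion between RGB tokens and depth
# tokens, with an extra attention bias derived from relative depth between
# token pairs.  Illustrative only; names and shapes are assumptions.
import torch
import torch.nn as nn

class DepthGuidedCrossAttention(nn.Module):
    def __init__(self, dim: int = 96, num_heads: int = 4):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.q = nn.Linear(dim, dim)       # queries from the RGB branch
        self.kv = nn.Linear(dim, 2 * dim)  # keys/values from the depth branch
        self.proj = nn.Linear(dim, dim)
        # Small MLP turning relative depth differences into a per-head bias,
        # standing in for the paper's depth-based relative position embedding.
        self.depth_bias = nn.Sequential(nn.Linear(1, num_heads), nn.Tanh())

    def forward(self, rgb_tokens, depth_tokens, token_depths):
        # rgb_tokens, depth_tokens: (B, N, dim); token_depths: (B, N) per-patch depth
        B, N, D = rgb_tokens.shape
        h, d = self.num_heads, D // self.num_heads
        q = self.q(rgb_tokens).reshape(B, N, h, d).transpose(1, 2)            # (B, h, N, d)
        k, v = self.kv(depth_tokens).reshape(B, N, 2, h, d).permute(2, 0, 3, 1, 4)
        attn = (q @ k.transpose(-2, -1)) * self.scale                          # (B, h, N, N)
        # Relative depth between every query/key token pair -> per-head bias.
        rel = (token_depths.unsqueeze(2) - token_depths.unsqueeze(1)).unsqueeze(-1)  # (B, N, N, 1)
        attn = attn + self.depth_bias(rel).permute(0, 3, 1, 2)                 # (B, h, N, N)
        out = (attn.softmax(dim=-1) @ v).transpose(1, 2).reshape(B, N, D)
        return self.proj(out)   # fused RGB tokens, fed back into the backbone
```

In an NF-DVT-style backbone, a module of this kind would sit alongside the Swin attention blocks so that the RGB branch is repeatedly refined by depth cues at each stage.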