Funding: Supported in part by the Major Project for New Generation of AI (2018AAA0100400), in part by the National Natural Science Foundation of China (61836014, U21B2042, 62072457, 62006231), and in part by the InnoHK Program.
Abstract: Monocular 3D object detection is challenging due to the lack of accurate depth information. Some methods estimate pixel-wise depth maps with off-the-shelf depth estimators and then use them as an additional input to augment the RGB images. Depth-based methods either convert the estimated depth maps to pseudo-LiDAR and apply LiDAR-based object detectors, or focus on fusion learning between the image and the depth map. However, they show limited performance and efficiency as a result of depth inaccuracy and complex convolution-based fusion schemes. Different from these approaches, our proposed depth-guided vision transformer with normalizing flows (NF-DVT) network uses normalizing flows to build priors on depth maps and thereby obtain more accurate depth information. We then develop a novel Swin-Transformer-based backbone with a fusion module that processes RGB image patches and depth map patches in two separate branches and fuses them with cross-attention so that the two branches exchange information. Furthermore, with the help of the pixel-wise relative depth values in the depth maps, we design new relative position embeddings for the cross-attention mechanism to capture more accurate sequence ordering of the input tokens. Our method is the first Swin-Transformer-based backbone architecture for monocular 3D object detection. Experimental results on the KITTI and the challenging Waymo Open datasets show the effectiveness of our proposed method and its superior performance over previous counterparts.
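The two-branch fusion described above can be illustrated with a minimal single-head cross-attention step, where tokens from the RGB branch attend over the depth-map tokens. This is only a sketch under illustrative assumptions (49 patches of dimension 32, single head, no projections or relative position embeddings), not the paper's actual NF-DVT implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, keys, values):
    """Single-head cross-attention: tokens of one branch (queries)
    attend over tokens of the other branch (keys/values)."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)    # (Nq, Nk) scaled similarities
    return softmax(scores, axis=-1) @ values  # weighted mix of the other branch

rng = np.random.default_rng(0)
rgb_tokens = rng.standard_normal((49, 32))    # 7x7 RGB patches, dim 32 (illustrative)
depth_tokens = rng.standard_normal((49, 32))  # matching depth-map patches
# residual fusion: RGB tokens enriched with depth information
fused = rgb_tokens + cross_attention(rgb_tokens, depth_tokens, depth_tokens)
print(fused.shape)  # (49, 32)
```

A symmetric call with the roles of the two branches swapped would let the depth branch gather information from the RGB branch in the same way.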
Funding: Supported in part by the Key Program of the Chinese Academy of Sciences under Grant QYZDJ-SSW-JSC025, in part by the National Natural Science Foundation of China under Grant 51721005, and in part by the Chinese Scholarship Council (CSC).
Abstract: Identification of faulty feeders in resonant grounding distribution networks remains a significant challenge due to the weak fault current and complicated working conditions. In this paper, we present a deep learning-based multi-label classification framework to reliably distinguish the faulty feeder. Three different neural networks (NNs) are built: a multilayer perceptron, a one-dimensional convolutional neural network (1D CNN), and a 2D CNN. However, labeled data may be difficult to obtain in the actual environment, so we use a simplified simulation model based on a full-scale test field (FSTF) to obtain sufficient labeled source data. Unlike most learning-based methods, which assume that the distributions of the source domain and the target domain are identical, we propose a sample-based transfer learning method that improves domain adaptation by using samples in the source domain with proper weights; the TrAdaBoost algorithm is adopted to update the weight of each sample. Recorded data obtained in the FSTF are utilized to test domain adaptability. According to our validation and testing, the validation accuracies are high when there is sufficient labeled data for training the proposed NNs, the proposed 2D CNN has the best domain adaptability, and the TrAdaBoost algorithm helps the NNs train an efficient classifier with better domain adaptation. We therefore conclude that the proposed method, especially the 2D CNN, is suitable for actual distribution networks.
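The sample-reweighting idea can be sketched with one TrAdaBoost-style weight update: misclassified source samples are down-weighted (treated as less transferable), while misclassified target samples are up-weighted as in AdaBoost. This is a simplified sketch of the standard TrAdaBoost update rule with binary 0/1 per-sample errors and illustrative weights, not the paper's code:

```python
import numpy as np

def tradaboost_weight_update(w_src, w_tgt, err_src, err_tgt, n_iters):
    """One TrAdaBoost weight update. err_src / err_tgt are per-sample
    0/1 indicators (1 = misclassified by the current weak learner)."""
    n_src = len(w_src)
    # fixed discount factor for source samples (standard TrAdaBoost choice)
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_src) / n_iters))
    # weighted error measured on the target domain only
    eps = np.clip(np.sum(w_tgt * err_tgt) / np.sum(w_tgt), 1e-10, 0.499)
    beta_tgt = eps / (1.0 - eps)
    w_src = w_src * beta_src ** err_src     # shrink misclassified source weights
    w_tgt = w_tgt * beta_tgt ** (-err_tgt)  # grow misclassified target weights
    total = w_src.sum() + w_tgt.sum()       # renormalize jointly
    return w_src / total, w_tgt / total

w_s, w_t = np.full(4, 0.125), np.full(4, 0.125)
err_s = np.array([1, 0, 0, 0])  # one source sample misclassified
err_t = np.array([0, 1, 0, 0])  # one target sample misclassified
w_s2, w_t2 = tradaboost_weight_update(w_s, w_t, err_s, err_t, n_iters=10)
print(w_s2[0] < w_s2[1])  # misclassified source weight shrinks: True
print(w_t2[1] > w_t2[0])  # misclassified target weight grows: True
```

Iterating this update steers the classifier toward source samples that behave like the target (recorded FSTF) data, which is the mechanism the abstract credits for the improved domain adaptation.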