Funding: National Natural Science Foundation of China (Grant/Award Number: 62106177); Central University Basic Research Fund of China (No. 2042020KF0016); supported by the supercomputing system in the Supercomputing Center of Wuhan University.
Abstract: The goal of street-to-aerial cross-view image geo-localization is to determine the location of a query street-view image by retrieving the aerial-view image of the same place. The drastic viewpoint and appearance gap between aerial-view and street-view images poses a major challenge for this task. In this paper, we propose a novel multiscale attention encoder to capture the multiscale contextual information of aerial-view and street-view images. To bridge the domain gap between the two views, we first use an inverse polar transform to make the street-view images approximately aligned with the aerial-view images. Then, the proposed multiscale attention encoder converts each image into a feature representation under the guidance of the learnt multiscale information. Finally, we propose a novel global mining strategy that enables the network to pay more attention to hard negative exemplars. Experiments on standard benchmark datasets show that our approach obtains a top-1 recall rate of 81.39% on the CVUSA dataset and 71.52% on the CVACT dataset, achieving state-of-the-art performance and significantly outperforming most existing methods.
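For readers unfamiliar with the reported metric, the following is a minimal sketch of how a top-k recall rate can be computed for cross-view retrieval. It assumes, purely for illustration, that each street-view query at index i matches the aerial image at the same index and that features are compared by cosine similarity; the function names and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_recall(street_feats, aerial_feats, k=1):
    """Fraction of queries whose true aerial match (same index) ranks in the top k."""
    hits = 0
    for i, query in enumerate(street_feats):
        sims = [cosine_similarity(query, ref) for ref in aerial_feats]
        # Rank of the true match = number of references scoring strictly higher.
        rank = sum(1 for s in sims if s > sims[i])
        if rank < k:
            hits += 1
    return hits / len(street_feats)


# Toy features (hypothetical): three street-view queries, three aerial references.
street = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
aerial = [[0.9, 0.1], [0.2, 0.8], [1.0, 0.0]]

recall_at_1 = top_k_recall(street, aerial, k=1)
recall_at_2 = top_k_recall(street, aerial, k=2)
```

In this toy example only the second query retrieves its true match first, so the top-1 recall is 1/3, while the top-2 recall rises to 2/3.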
Funding: This work was supported by the National Natural Science Foundation of China (Nos. 61977063 and 61872020). The authors thank all the patients for providing their MRI images and the School of Biomedical Engineering at Southern Medical University, China, for providing the brain tumor data set. We appreciate Dr. Fenfen Li, Wenzhou Eye Hospital, Wenzhou Medical University, China, for her support with clinical consulting and language editing.
Abstract: Automatic segmentation and classification of brain tumors are of great importance to clinical treatment. However, both tasks are challenging because tumors are small and vary widely in morphology. In this paper, we propose a multitask multiscale residual attention network (MMRAN) that segments and classifies brain tumors simultaneously. The proposed MMRAN is based on U-Net, with a parallel branch added at the end of the encoder as the classification network. First, we propose a novel multiscale residual attention module (MRAM) that aggregates contextual features and better combines channel attention and spatial attention, and we add it to the shared parameter layer of MMRAN. Second, we propose a dynamic weight training method that improves model performance while minimizing the need for multiple experiments to determine the optimal weights for each task. Finally, prior knowledge of brain tumors is added to the postprocessing of segmented images to further improve segmentation accuracy. We evaluated MMRAN on a brain tumor data set containing meningioma, glioma, and pituitary tumors. In terms of segmentation performance, our method achieves Dice, Hausdorff distance (HD), mean intersection over union (MIoU), and mean pixel accuracy (MPA) values of 80.03%, 6.649 mm, 84.38%, and 89.41%, respectively. In terms of classification performance, our method achieves accuracy, recall, precision, and F1-score of 89.87%, 90.44%, 88.56%, and 89.49%, respectively. Compared with other networks, MMRAN performs better in both segmentation and classification, which significantly aids medical professionals in brain tumor management. The code and data set are available at https://github.com/linkenfaqiu/MMRAN.
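As a reference for two of the segmentation metrics named above, here is a minimal sketch of the Dice coefficient and intersection over union (IoU) for a single pair of binary masks. It is an illustration under stated assumptions, not the authors' evaluation code: masks are plain nested lists of 0/1, the function name is hypothetical, and two empty masks are scored as a perfect match by convention. The MIoU reported in the abstract would additionally average IoU over classes.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]

    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    pred_sum = sum(flat_pred)
    target_sum = sum(flat_target)
    union = pred_sum + target_sum - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou


# Toy masks (hypothetical): two of the three foreground pixels overlap.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]

dice, iou = dice_and_iou(pred_mask, true_mask)
```

For these toy masks the overlap is 2 pixels out of 3 predicted and 3 true foreground pixels, giving Dice = 4/6 and IoU = 2/4.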