Semantic segmentation is for pixel-level classification tasks,and contextual information has an important impact on the performance of segmentation.In order to capture richer contextual information,we adopt ResNet as ...Semantic segmentation is for pixel-level classification tasks,and contextual information has an important impact on the performance of segmentation.In order to capture richer contextual information,we adopt ResNet as the backbone network and designs an encoder-decoder architecture based on multidimensional attention(MDA)module and multiscale upsampling(MSU)module.The MDA module calculates the attention matrices of the three dimensions to capture the dependency of each position,and adaptively captures the image features.The MSU module adopts parallel branches to capture the multiscale features of the images,and multiscale feature aggregation can enhance contextual information.A series of experiments demonstrate the validity of the model on Cityscapes and Camvid datasets.展开更多
基金Fundamental Research Fund in Heilongjiang Provincial Universities(Nos.135409602,135409102)。
文摘Semantic segmentation is for pixel-level classification tasks,and contextual information has an important impact on the performance of segmentation.In order to capture richer contextual information,we adopt ResNet as the backbone network and designs an encoder-decoder architecture based on multidimensional attention(MDA)module and multiscale upsampling(MSU)module.The MDA module calculates the attention matrices of the three dimensions to capture the dependency of each position,and adaptively captures the image features.The MSU module adopts parallel branches to capture the multiscale features of the images,and multiscale feature aggregation can enhance contextual information.A series of experiments demonstrate the validity of the model on Cityscapes and Camvid datasets.