Abstract: Objective: Because rain streaks in rainy images differ in direction, density, and size, single-image deraining remains a challenging research problem. Existing algorithms still over-derain or under-derain some complex images; during deraining, high-frequency edge information of some complex images is erased, or rain components are left behind. To address these problems, this paper proposes a three-dimension attention and Transformer deraining network (TDATDN). Method: A three-dimensional attention mechanism is combined with the residual dense block structure to handle the fusion of high-dimensional channel features within residual dense blocks; a Transformer is used to compute global correlations between features; and, to counter the destruction of high-frequency image information and the erasure of structural information during deraining, a multi-scale structural similarity loss is combined with commonly used image-deraining losses to train the network. Results: The proposed TDATDN network is evaluated on the Rain12000 rain-streak dataset, reaching a peak signal-to-noise ratio (PSNR) of 33.01 dB and a structural similarity (SSIM) of 0.9278. The experimental results show that, compared with previous deep-learning-based deraining networks, the proposed algorithm markedly improves single-image deraining. Conclusion: The proposed TDATDN image-deraining network combines the strengths of 3D attention, the Transformer, and the encoder-decoder architecture, and performs single-image deraining well.
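As a rough illustration of the loss design described above (a multi-scale structural similarity term combined with a common pixel-wise deraining loss), the following is a minimal PyTorch-style sketch. The uniform window, number of scales, and weighting alpha are illustrative assumptions, not values from the paper.

```python
import torch.nn.functional as F

def ssim(x, y, win=11, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-scale SSIM with a uniform averaging window; inputs in [0, 1]."""
    pad = win // 2
    mu_x = F.avg_pool2d(x, win, 1, pad)
    mu_y = F.avg_pool2d(y, win, 1, pad)
    var_x = F.avg_pool2d(x * x, win, 1, pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, 1, pad) - mu_y ** 2
    cov_xy = F.avg_pool2d(x * y, win, 1, pad) - mu_x * mu_y
    num = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return (num / den).mean()

def derain_loss(pred, gt, scales=3, alpha=0.84):
    """Simplified multi-scale SSIM term averaged over downsampled scales,
    combined with a pixel-wise L1 term; alpha and scales are illustrative."""
    ms_ssim, p, g = 0.0, pred, gt
    for _ in range(scales):
        ms_ssim = ms_ssim + ssim(p, g) / scales
        p, g = F.avg_pool2d(p, 2), F.avg_pool2d(g, 2)
    return alpha * (1.0 - ms_ssim) + (1.0 - alpha) * F.l1_loss(pred, gt)

# usage sketch: loss = derain_loss(derained, clean)  # both (B, 3, H, W) in [0, 1]
```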
Funding: National Natural Science Foundation of China (No. 61806006); Innovation Program for Graduate of Jiangsu Province (No. KYLX160-781); University Superior Discipline Construction Project of Jiangsu Province.
Abstract: Self-attention networks and the Transformer have dominated machine translation and natural language processing, and have shown great potential in vision tasks such as image classification and object detection. Inspired by the great progress of the Transformer, we propose a novel, general, and robust voxel feature encoder for 3D object detection based on the traditional Transformer. We first investigate the permutation invariance of self-attention over sequence data and apply it to point cloud processing. Then we construct a self-attention-based voxel feature layer that adaptively learns a local and robust context for each voxel from the spatial relationships and the context information exchanged among all points within the voxel. Finally, we build a general voxel feature learning framework for 3D object detection with the voxel feature layer at its core. The voxel feature with Transformer (VFT) can easily be plugged into any other voxel-based 3D object detection framework and serves as the backbone voxel feature extractor. Experimental results on the KITTI dataset demonstrate that our method achieves state-of-the-art performance on 3D object detection.
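To make the voxel feature layer idea concrete, here is a minimal, illustrative PyTorch sketch of self-attention over the points inside each voxel followed by permutation-invariant pooling. The class name, dimensions, and max-pooling are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class VoxelSelfAttentionLayer(nn.Module):
    """Sketch: every point attends to every other point in the same voxel,
    then a max-pool produces one order-invariant feature per voxel."""
    def __init__(self, in_dim=4, embed_dim=64, num_heads=4):
        super().__init__()
        self.embed = nn.Linear(in_dim, embed_dim)  # per-point embedding
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, points, pad_mask):
        # points:   (V, T, in_dim) -- V voxels, up to T points each, zero-padded
        # pad_mask: (V, T) boolean, True where the slot is padding
        # (voxels containing no real points are assumed to be filtered out)
        x = self.embed(points)
        ctx, _ = self.attn(x, x, x, key_padding_mask=pad_mask)
        x = self.norm(x + ctx)
        x = x.masked_fill(pad_mask.unsqueeze(-1), float("-inf"))
        return x.max(dim=1).values  # (V, embed_dim) voxel feature

# usage sketch: voxel_feats = VoxelSelfAttentionLayer()(points, pad_mask)
```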
Abstract: To address the poor reconstruction of fine features and edge regions in 3D reconstruction, a multi-view 3D reconstruction network based on feature alignment and context guidance, AGA-MVSNet, is proposed. First, a feature alignment module (FA) and a feature selection module (FS) are constructed, which align features from different levels of the feature pyramid before fusing them, improving feature extraction for small objects and edge regions. Then, a context guidance module is added to cost-volume regularization; with only a slight increase in memory usage, it makes full use of surrounding information, strengthens the correlations within the cost volume, and improves the accuracy and completeness of 3D reconstruction. Finally, experiments on the DTU dataset show that, compared with the baseline network CasMVSNet, the method improves accuracy by 2.2% and overall reconstruction quality by 2.5%. It also performs very well on the Tanks and Temples dataset compared with several known methods, and produces good point clouds on the BlendedMVS dataset.
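One common way to realize cross-level feature alignment is to predict a per-pixel offset field and warp the upsampled coarse feature before fusion; the PyTorch sketch below illustrates that general idea only. The module name, channel sizes, and the grid_sample-based warp are assumptions and are not taken from the AGA-MVSNet paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureAlign(nn.Module):
    """Sketch: predict a 2D offset field from the fine and upsampled coarse
    features, warp the coarse feature with it, then fuse the two levels."""
    def __init__(self, ch):
        super().__init__()
        self.offset = nn.Conv2d(2 * ch, 2, kernel_size=3, padding=1)  # per-pixel (dx, dy)
        self.fuse = nn.Conv2d(2 * ch, ch, kernel_size=3, padding=1)

    def forward(self, fine, coarse):
        # fine:   (B, C, H, W)     higher-resolution pyramid level
        # coarse: (B, C, H/2, W/2) lower-resolution pyramid level
        up = F.interpolate(coarse, size=fine.shape[-2:], mode="bilinear",
                           align_corners=False)
        offset = self.offset(torch.cat([fine, up], dim=1))  # (B, 2, H, W)

        # sampling grid in normalized [-1, 1] coordinates, shifted by the offsets
        b, _, h, w = fine.shape
        ys, xs = torch.meshgrid(torch.linspace(-1, 1, h, device=fine.device),
                                torch.linspace(-1, 1, w, device=fine.device),
                                indexing="ij")
        grid = torch.stack([xs, ys], dim=-1).expand(b, h, w, 2)
        grid = grid + offset.permute(0, 2, 3, 1)
        aligned = F.grid_sample(up, grid, mode="bilinear", align_corners=False)

        return self.fuse(torch.cat([fine, aligned], dim=1))  # fused, aligned feature
```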