Funding: Supported by the National Key Research and Development Program of China (2022YFA1404602); the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB0580000); the Key Deployment Projects of the Chinese Academy of Sciences (ZDRW-XH-2021-7-1); the National Natural Science Foundation of China (61975223, 61991442, 62305362, 62075230); the Program of Shanghai Academic/Technology Research Leader (22XD1424400); the Shanghai Municipal Science and Technology Major Project (2019SHZDZX01); and the Natural Science Foundation of Shanghai (19ZR1465400).
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 61971078) and the Chongqing Education Commission Science and Technology Major Project (No. KJZD-M202301901).
Abstract: Single-modal visible or infrared images each provide limited information: infrared images capture significant thermal radiation data, whereas visible images excel at presenting detailed texture information. Combining images from both modalities leverages their respective strengths and mitigates their individual limitations, yielding high-quality images with enhanced contrast and rich texture details. Such capabilities hold promise for advanced visual tasks including target detection, instance segmentation, military surveillance, and pedestrian detection, among others. This paper introduces a dual-branch decomposition fusion network based on an AutoEncoder (AE), which decomposes multi-modal features into intensity and texture information for enhanced fusion. A local contrast enhancement module (CEM) and a texture detail enhancement module (DEM) are devised to process the decomposed images, and the decoder then performs the image fusion. The proposed loss function ensures effective retention of key information from the source images of both modalities. Extensive comparison and generalization experiments demonstrate the superior performance of the network in preserving pixel intensity distribution and retaining texture details. The qualitative results show advantages in fused detail and local contrast. In the quantitative experiments, entropy (EN), mutual information (MI), structural similarity (SSIM), and other metrics improve and, taken as a whole, exceed those of the state-of-the-art (SOTA) models.
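Two of the quantitative metrics named in the abstract, EN and MI, have standard histogram-based definitions. The sketch below illustrates them under stated assumptions: inputs are 8-bit grayscale NumPy arrays of equal shape, and the function names `entropy` and `mutual_information` are hypothetical. It is not the paper's evaluation code, and the abstract does not say how its scores were computed.

```python
import numpy as np


def entropy(img, bins=256):
    """Shannon entropy (in bits) of an 8-bit image's intensity histogram."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins, treating 0 * log(0) as 0
    return float(-(p * np.log2(p)).sum())


def mutual_information(a, b, bins=256):
    """Mutual information (in bits) between two equally sized 8-bit images."""
    joint, _, _ = np.histogram2d(
        a.ravel(), b.ravel(), bins=bins, range=[[0, 256], [0, 256]]
    )
    pxy = joint / joint.sum()            # joint intensity distribution
    px = pxy.sum(axis=1, keepdims=True)  # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of b, shape (1, bins)
    nz = pxy > 0                         # skip zero-probability cells
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())
```

A common convention in the fusion literature, assumed here rather than taken from the abstract, is to report MI for a fused image `f` as `mutual_information(f, ir) + mutual_information(f, vis)`, where `ir` and `vis` are the infrared and visible source images.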