Cloud detection from satellite and drone imagery is crucial for applications such as weather forecasting and environmentalmonitoring.Addressing the limitations of conventional convolutional neural networks,we propose ...Cloud detection from satellite and drone imagery is crucial for applications such as weather forecasting and environmentalmonitoring.Addressing the limitations of conventional convolutional neural networks,we propose an innovative transformer-based method.This method leverages transformers,which are adept at processing data sequences,to enhance cloud detection accuracy.Additionally,we introduce a Cyclic Refinement Architecture that improves the resolution and quality of feature extraction,thereby aiding in the retention of critical details often lost during cloud detection.Our extensive experimental validation shows that our approach significantly outperforms established models,excelling in high-resolution feature extraction and precise cloud segmentation.By integrating Positional Visual Transformers(PVT)with this architecture,our method advances high-resolution feature delineation and segmentation accuracy.Ultimately,our research offers a novel perspective for surmounting traditional challenges in cloud detection and contributes to the advancement of precise and dependable image analysis across various domains.展开更多
Significant advancements have been achieved in road surface extraction based on high-resolution remote sensingimage processing. Most current methods rely on fully supervised learning, which necessitates enormous human...Significant advancements have been achieved in road surface extraction based on high-resolution remote sensingimage processing. Most current methods rely on fully supervised learning, which necessitates enormous humaneffort to label the image. Within this field, other research endeavors utilize weakly supervised methods. Theseapproaches aim to reduce the expenses associated with annotation by leveraging sparsely annotated data, such asscribbles. This paper presents a novel technique called a weakly supervised network using scribble-supervised andedge-mask (WSSE-net). This network is a three-branch network architecture, whereby each branch is equippedwith a distinct decoder module dedicated to road extraction tasks. One of the branches is dedicated to generatingedge masks using edge detection algorithms and optimizing road edge details. The other two branches supervise themodel’s training by employing scribble labels and spreading scribble information throughout the image. To addressthe historical flaw that created pseudo-labels that are not updated with network training, we use mixup to blendprediction results dynamically and continually update new pseudo-labels to steer network training. Our solutiondemonstrates efficient operation by simultaneously considering both edge-mask aid and dynamic pseudo-labelsupport. The studies are conducted on three separate road datasets, which consist primarily of high-resolutionremote-sensing satellite photos and drone images. The experimental findings suggest that our methodologyperforms better than advanced scribble-supervised approaches and specific traditional fully supervised methods.展开更多
Current data-driven deep learning(DL)methods typically reconstruct subsurface velocity models directly from pre-stack seismic records.However,these purely data-driven methods are often less robust and produce results ...Current data-driven deep learning(DL)methods typically reconstruct subsurface velocity models directly from pre-stack seismic records.However,these purely data-driven methods are often less robust and produce results that are less physically interpretative.Here,the authors propose a new method that uses migration images as input,combined with convolutional neural networks to construct high-resolution velocity models.Compared to directly using pre-stack seismic records as input,the nonlinearity between migration images and velocity models is significantly reduced.Additionally,the advantage of using migration images lies in its ability to more comprehensively capture the reflective properties of the subsurface medium,including amplitude and phase information,thereby to provide richer physical information in guiding the reconstruction of the velocity model.This approach not only improves the accuracy and resolution of the reconstructed velocity models,but also enhances the physical interpretability and robustness.Numerical experiments on synthetic data show that the proposed method has superior reconstruction performance and strong generalization capability when dealing with complex geological structures,and shows great potential in providing efficient solutions for the task of reconstructing high-wavenumber components.展开更多
BACKGROUND Intracranial atherosclerosis,a leading cause of stroke,involves arterial plaque formation.This study explores the link between plaque remodelling patterns and diabetes using high-resolution vessel wall imag...BACKGROUND Intracranial atherosclerosis,a leading cause of stroke,involves arterial plaque formation.This study explores the link between plaque remodelling patterns and diabetes using high-resolution vessel wall imaging(HR-VWI).AIM To investigate the factors of intracranial atherosclerotic remodelling patterns and the relationship between intracranial atherosclerotic remodelling and diabetes mellitus using HR-VWI.METHODS Ninety-four patients diagnosed with middle cerebral artery or basilar artery INTRODUCTION Intracranial atherosclerotic disease is one of the main causes of ischaemic stroke in the world,accounting for approx-imately 10%of transient ischaemic attacks and 30%-50%of ischaemic strokes[1].It is the most common factor among Asian people[2].The adaptive changes in the structure and function of blood vessels that can adapt to changes in the internal and external environment are called vascular remodelling,which is a common and important pathological mechanism in atherosclerotic diseases,and the remodelling mode of atherosclerotic plaques is closely related to the occurrence of stroke.Positive remodelling(PR)is an outwards compensatory remodelling where the arterial wall grows outwards in an attempt to maintain a constant lumen diameter.For a long time,it was believed that the degree of stenosis can accurately reflect the risk of ischaemic stroke[3-5].Previous studies have revealed that lesions without significant luminal stenosis can also lead to acute events[6,7],as summarized in a recent meta-analysis study in which approximately 50%of acute/subacute ischaemic events were due to this type of lesion[6].Research[8,9]has pointed out that the PR of plaques is more dangerous and more likely to cause acute ischaemic stroke.Previous studies[10-13]have found that there are specific vascular remodelling phenomena in the coronary and carotid arteries of diabetic patients.However,due to the deep location and small lumen of intracranial arteries and limitations of imaging techniques,the relationship between intracranial arterial remodelling and diabetes is still unclear.In recent years,with the development of magnetic resonance technology and the emergence of high-resolution(HR)vascular wall imaging,a clear and multidimensional display of the intracranial vascular wall has been achieved.Therefore,in this study,HR wall imaging(HR-VWI)was used to display the remodelling characteristics of bilateral middle cerebral arteries and basilar arteries and to explore the factors of intracranial vascular remodelling and its relationship with diabetes.展开更多
BACKGROUND No studies have yet been conducted on changes in microcirculatory hemody-namics of colorectal adenomas in vivo under endoscopy.The microcirculation of the colorectal adenoma could be observed in vivo by a n...BACKGROUND No studies have yet been conducted on changes in microcirculatory hemody-namics of colorectal adenomas in vivo under endoscopy.The microcirculation of the colorectal adenoma could be observed in vivo by a novel high-resolution magnification endoscopy with blue laser imaging(BLI),thus providing a new insight into the microcirculation of early colon tumors.AIM To observe the superficial microcirculation of colorectal adenomas using the novel magnifying colonoscope with BLI and quantitatively analyzed the changes in hemodynamic parameters.METHODS From October 2019 to January 2020,11 patients were screened for colon adenomas with the novel high-resolution magnification endoscope with BLI.Video images were recorded and processed with Adobe Premiere,Adobe Photoshop and Image-pro Plus software.Four microcirculation parameters:Microcirculation vessel density(MVD),mean vessel width(MVW)with width standard deviation(WSD),and blood flow velocity(BFV),were calculated for adenomas and the surrounding normal mucosa.RESULTS A total of 16 adenomas were identified.Compared with the normal surrounding mucosa,the superficial vessel density in the adenomas was decreased(MVD:0.95±0.18 vs 1.17±0.28μm/μm2,P<0.05).MVW(5.11±1.19 vs 4.16±0.76μm,P<0.05)and WSD(11.94±3.44 vs 9.04±3.74,P<0.05)were both increased.BFV slowed in the adenomas(709.74±213.28 vs 1256.51±383.31μm/s,P<0.05).CONCLUSION The novel high-resolution magnification endoscope with BLI can be used for in vivo study of adenoma superficial microcirculation.Superficial vessel density was decreased,more irregular,with slower blood flow.展开更多
Counterfeiting of modern banknotes poses a significant challenge,prompting the use of various preventive measures.One such measure is the magnetic anti-counterfeiting strip.However,due to its inherent weak magnetic pr...Counterfeiting of modern banknotes poses a significant challenge,prompting the use of various preventive measures.One such measure is the magnetic anti-counterfeiting strip.However,due to its inherent weak magnetic properties,visualizing its magnetic distribution has been a longstanding challenge.In this work,we introduce an innovative method by using a fiber optic diamond probe,a highly sensitive quantum sensor designed specifically for detecting extremely weak magnetic fields.We employ this probe to achieve high-resolution imaging of the magnetic fields associated with the RMB 50denomination anti-counterfeiting strip.Additionally,we conduct computer simulations by using COMSOL Multiphysics software to deduce the potential geometric characteristics and material composition of the magnetic region within the anti-counterfeiting strip.The findings and method presented in this study hold broader significance,extending the RMB 50 denomination to various denominations of the Chinese currency and other items that employ magnetic anti-counterfeiting strips.These advances have the potential to significantly improve and promote security measures in order to prevent the banknotes from being counterfeited.展开更多
The exploration of building detection plays an important role in urban planning,smart city and military.Aiming at the problem of high overlapping ratio of detection frames for dense building detection in high resoluti...The exploration of building detection plays an important role in urban planning,smart city and military.Aiming at the problem of high overlapping ratio of detection frames for dense building detection in high resolution remote sensing images,we present an effective YOLOv3 framework,corner regression-based YOLOv3(Correg-YOLOv3),to localize dense building accurately.This improved YOLOv3 algorithm establishes a vertex regression mechanism and an additional loss item about building vertex offsets relative to the center point of bounding box.By extending output dimensions,the trained model is able to output the rectangular bounding boxes and the building vertices meanwhile.Finally,we evaluate the performance of the Correg-YOLOv3 on our self-produced data set and provide a comparative analysis qualitatively and quantitatively.The experimental results achieve high performance in precision(96.45%),recall rate(95.75%),F1 score(96.10%)and average precision(98.05%),which were 2.73%,5.4%,4.1%and 4.73%higher than that of YOLOv3.Therefore,our proposed algorithm effectively tackles the problem of dense building detection in high resolution images.展开更多
To address the problem of the low accuracy of transverse velocity field measurements for small targets in highresolution solar images,we proposed a novel velocity field measurement method for high-resolution solar ima...To address the problem of the low accuracy of transverse velocity field measurements for small targets in highresolution solar images,we proposed a novel velocity field measurement method for high-resolution solar images based on PWCNet.This method transforms the transverse velocity field measurements into an optical flow field prediction problem.We evaluated the performance of the proposed method using the Hαand TiO data sets obtained from New Vacuum Solar Telescope observations.The experimental results show that our method effectively predicts the optical flow of small targets in images compared with several typical machine-and deeplearning methods.On the Hαdata set,the proposed method improves the image structure similarity from 0.9182 to0.9587 and reduces the mean of residuals from 24.9931 to 15.2818;on the TiO data set,the proposed method improves the image structure similarity from 0.9289 to 0.9628 and reduces the mean of residuals from 25.9908 to17.0194.The optical flow predicted using the proposed method can provide accurate data for the atmospheric motion information of solar images.The code implementing the proposed method is available on https://github.com/lygmsy123/transverse-velocity-field-measurement.展开更多
Dear Editor,This letter proposes to integrate dendritic learnable network architecture with Vision Transformer to improve the accuracy of image recognition.In this study,based on the theory of dendritic neurons in neu...Dear Editor,This letter proposes to integrate dendritic learnable network architecture with Vision Transformer to improve the accuracy of image recognition.In this study,based on the theory of dendritic neurons in neuroscience,we design a network that is more practical for engineering to classify visual features.Based on this,we propose a dendritic learning-incorporated vision Transformer(DVT),which out-performs other state-of-the-art methods on three image recognition benchmarks.展开更多
A novel image fusion network framework with an autonomous encoder and decoder is suggested to increase thevisual impression of fused images by improving the quality of infrared and visible light picture fusion. The ne...A novel image fusion network framework with an autonomous encoder and decoder is suggested to increase thevisual impression of fused images by improving the quality of infrared and visible light picture fusion. The networkcomprises an encoder module, fusion layer, decoder module, and edge improvementmodule. The encoder moduleutilizes an enhanced Inception module for shallow feature extraction, then combines Res2Net and Transformerto achieve deep-level co-extraction of local and global features from the original picture. An edge enhancementmodule (EEM) is created to extract significant edge features. A modal maximum difference fusion strategy isintroduced to enhance the adaptive representation of information in various regions of the source image, therebyenhancing the contrast of the fused image. The encoder and the EEM module extract features, which are thencombined in the fusion layer to create a fused picture using the decoder. Three datasets were chosen to test thealgorithmproposed in this paper. The results of the experiments demonstrate that the network effectively preservesbackground and detail information in both infrared and visible images, yielding superior outcomes in subjectiveand objective evaluations.展开更多
Astronomical imaging technologies are basic tools for the exploration of the universe,providing basic data for the research of astronomy and space physics.The Soft X-ray Imager(SXI)carried by the Solar wind Magnetosph...Astronomical imaging technologies are basic tools for the exploration of the universe,providing basic data for the research of astronomy and space physics.The Soft X-ray Imager(SXI)carried by the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE)aims to capture two-dimensional(2-D)images of the Earth’s magnetosheath by using soft X-ray imaging.However,the observed 2-D images are affected by many noise factors,destroying the contained information,which is not conducive to the subsequent reconstruction of the three-dimensional(3-D)structure of the magnetopause.The analysis of SXI-simulated observation images shows that such damage cannot be evaluated with traditional restoration models.This makes it difficult to establish the mapping relationship between SXIsimulated observation images and target images by using mathematical models.We propose an image restoration algorithm for SXIsimulated observation images that can recover large-scale structure information on the magnetosphere.The idea is to train a patch estimator by selecting noise–clean patch pairs with the same distribution through the Classification–Expectation Maximization algorithm to achieve the restoration estimation of the SXI-simulated observation image,whose mapping relationship with the target image is established by the patch estimator.The Classification–Expectation Maximization algorithm is used to select multiple patch clusters with the same distribution and then train different patch estimators so as to improve the accuracy of the estimator.Experimental results showed that our image restoration algorithm is superior to other classical image restoration algorithms in the SXI-simulated observation image restoration task,according to the peak signal-to-noise ratio and structural similarity.The restoration results of SXI-simulated observation images are used in the tangent fitting approach and the computed tomography approach toward magnetospheric reconstruction techniques,significantly improving the reconstruction results.Hence,the proposed technology may be feasible for processing SXI-simulated observation images.展开更多
The Soft X-ray Imager(SXI)is part of the scientific payload of the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE)mission.SMILE is a joint science mission between the European Space Agency(ESA)and the Chinese...The Soft X-ray Imager(SXI)is part of the scientific payload of the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE)mission.SMILE is a joint science mission between the European Space Agency(ESA)and the Chinese Academy of Sciences(CAS)and is due for launch in 2025.SXI is a compact X-ray telescope with a wide field-of-view(FOV)capable of encompassing large portions of Earth’s magnetosphere from the vantage point of the SMILE orbit.SXI is sensitive to the soft X-rays produced by the Solar Wind Charge eXchange(SWCX)process produced when heavy ions of solar wind origin interact with neutral particles in Earth’s exosphere.SWCX provides a mechanism for boundary detection within the magnetosphere,such as the position of Earth’s magnetopause,because the solar wind heavy ions have a very low density in regions of closed magnetic field lines.The sensitivity of the SXI is such that it can potentially track movements of the magnetopause on timescales of a few minutes and the orbit of SMILE will enable such movements to be tracked for segments lasting many hours.SXI is led by the University of Leicester in the United Kingdom(UK)with collaborating organisations on hardware,software and science support within the UK,Europe,China and the United States.展开更多
Global images of auroras obtained by cameras on spacecraft are a key tool for studying the near-Earth environment.However,the cameras are sensitive not only to auroral emissions produced by precipitating particles,but...Global images of auroras obtained by cameras on spacecraft are a key tool for studying the near-Earth environment.However,the cameras are sensitive not only to auroral emissions produced by precipitating particles,but also to dayglow emissions produced by photoelectrons induced by sunlight.Nightglow emissions and scattered sunlight can contribute to the background signal.To fully utilize such images in space science,background contamination must be removed to isolate the auroral signal.Here we outline a data-driven approach to modeling the background intensity in multiple images by formulating linear inverse problems based on B-splines and spherical harmonics.The approach is robust,flexible,and iteratively deselects outliers,such as auroral emissions.The final model is smooth across the terminator and accounts for slow temporal variations and large-scale asymmetries in the dayglow.We demonstrate the model by using the three far ultraviolet cameras on the Imager for Magnetopause-to-Aurora Global Exploration(IMAGE)mission.The method can be applied to historical missions and is relevant for upcoming missions,such as the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE)mission.展开更多
Limited by the dynamic range of the detector,saturation artifacts usually occur in optical coherence tomography(OCT)imaging for high scattering media.The available methods are difficult to remove saturation artifacts ...Limited by the dynamic range of the detector,saturation artifacts usually occur in optical coherence tomography(OCT)imaging for high scattering media.The available methods are difficult to remove saturation artifacts and restore texture completely in OCT images.We proposed a deep learning-based inpainting method of saturation artifacts in this paper.The generation mechanism of saturation artifacts was analyzed,and experimental and simulated datasets were built based on the mechanism.Enhanced super-resolution generative adversarial networks were trained by the clear–saturated phantom image pairs.The perfect reconstructed results of experimental zebrafish and thyroid OCT images proved its feasibility,strong generalization,and robustness.展开更多
Throughout the SMILE mission the satellite will be bombarded by radiation which gradually damages the focal plane devices and degrades their performance.In order to understand the changes of the CCD370s within the sof...Throughout the SMILE mission the satellite will be bombarded by radiation which gradually damages the focal plane devices and degrades their performance.In order to understand the changes of the CCD370s within the soft X-ray Imager,an initial characterisation of the devices has been carried out to give a baseline performance level.Three CCDs have been characterised,the two flight devices and the flight spa re.This has been carried out at the Open University in a bespo ke cleanroom measure ment facility.The results show that there is a cluster of bright pixels in the flight spa re which increases in size with tempe rature.However at the nominal ope rating tempe rature(-120℃) it is within the procure ment specifications.Overall,the devices meet the specifications when ope rating at -120℃ in 6 × 6 binned frame transfer science mode.The se rial charge transfer inefficiency degrades with temperature in full frame mode.However any charge losses are recovered when binning/frame transfer is implemented.展开更多
Transformer-based models have facilitated significant advances in object detection.However,their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unman...Transformer-based models have facilitated significant advances in object detection.However,their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle(UAV)imagery.Addressing these limitations,we propose a hybrid transformer-based detector,H-DETR,and enhance it for dense small objects,leading to an accurate and efficient model.Firstly,we introduce a hybrid transformer encoder,which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently.Furthermore,we propose two novel strategies to enhance detection performance without incurring additional inference computation.Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression.Adversarial denoising learning is a novel enhancement method inspired by adversarial learning,which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise.Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach,achieving a significant improvement in accuracy with a reduction in computational complexity.Our method achieves 31.9%and 21.1%AP on the VisDrone and UAVDT datasets,respectively,and has a faster inference speed,making it a competitive model in UAV image object detection.展开更多
Identification of the ice channel is the basic technology for developing intelligent ships in ice-covered waters,which is important to ensure the safety and economy of navigation.In the Arctic,merchant ships with low ...Identification of the ice channel is the basic technology for developing intelligent ships in ice-covered waters,which is important to ensure the safety and economy of navigation.In the Arctic,merchant ships with low ice class often navigate in channels opened up by icebreakers.Navigation in the ice channel often depends on good maneuverability skills and abundant experience from the captain to a large extent.The ship may get stuck if steered into ice fields off the channel.Under this circumstance,it is very important to study how to identify the boundary lines of ice channels with a reliable method.In this paper,a two-staged ice channel identification method is developed based on image segmentation and corner point regression.The first stage employs the image segmentation method to extract channel regions.In the second stage,an intelligent corner regression network is proposed to extract the channel boundary lines from the channel region.A non-intelligent angle-based filtering and clustering method is proposed and compared with corner point regression network.The training and evaluation of the segmentation method and corner regression network are carried out on the synthetic and real ice channel dataset.The evaluation results show that the accuracy of the method using the corner point regression network in the second stage is achieved as high as 73.33%on the synthetic ice channel dataset and 70.66%on the real ice channel dataset,and the processing speed can reach up to 14.58frames per second.展开更多
Watermarks can provide reliable and secure copyright protection for optical coherence tomography(OCT)fundus images.The effective image segmentation is helpful for promoting OCT image watermarking.However,OCT images ha...Watermarks can provide reliable and secure copyright protection for optical coherence tomography(OCT)fundus images.The effective image segmentation is helpful for promoting OCT image watermarking.However,OCT images have a large amount of low-quality data,which seriously affects the performance of segmentationmethods.Therefore,this paper proposes an effective segmentation method for OCT fundus image watermarking using a rough convolutional neural network(RCNN).First,the rough-set-based feature discretization module is designed to preprocess the input data.Second,a dual attention mechanism for feature channels and spatial regions in the CNN is added to enable the model to adaptively select important information for fusion.Finally,the refinement module for enhancing the extraction power of multi-scale information is added to improve the edge accuracy in segmentation.RCNN is compared with CE-Net and MultiResUNet on 83 gold standard 3D retinal OCT data samples.The average dice similarly coefficient(DSC)obtained by RCNN is 6%higher than that of CE-Net.The average 95 percent Hausdorff distance(95HD)and average symmetric surface distance(ASD)obtained by RCNN are 32.4%and 33.3%lower than those of MultiResUNet,respectively.We also evaluate the effect of feature discretization,as well as analyze the initial learning rate of RCNN and conduct ablation experiments with the four different models.The experimental results indicate that our method can improve the segmentation accuracy of OCT fundus images,providing strong support for its application in medical image watermarking.展开更多
Convolutional neural networks depend on deep network architectures to extract accurate information for image super‐resolution.However,obtained information of these con-volutional neural networks cannot completely exp...Convolutional neural networks depend on deep network architectures to extract accurate information for image super‐resolution.However,obtained information of these con-volutional neural networks cannot completely express predicted high‐quality images for complex scenes.A dynamic network for image super‐resolution(DSRNet)is presented,which contains a residual enhancement block,wide enhancement block,feature refine-ment block and construction block.The residual enhancement block is composed of a residual enhanced architecture to facilitate hierarchical features for image super‐resolution.To enhance robustness of obtained super‐resolution model for complex scenes,a wide enhancement block achieves a dynamic architecture to learn more robust information to enhance applicability of an obtained super‐resolution model for varying scenes.To prevent interference of components in a wide enhancement block,a refine-ment block utilises a stacked architecture to accurately learn obtained features.Also,a residual learning operation is embedded in the refinement block to prevent long‐term dependency problem.Finally,a construction block is responsible for reconstructing high‐quality images.Designed heterogeneous architecture can not only facilitate richer structural information,but also be lightweight,which is suitable for mobile digital devices.Experimental results show that our method is more competitive in terms of performance,recovering time of image super‐resolution and complexity.The code of DSRNet can be obtained at https://github.com/hellloxiaotian/DSRNet.展开更多
To address the issues of incomplete information,blurred details,loss of details,and insufficient contrast in infrared and visible image fusion,an image fusion algorithm based on a convolutional autoencoder is proposed...To address the issues of incomplete information,blurred details,loss of details,and insufficient contrast in infrared and visible image fusion,an image fusion algorithm based on a convolutional autoencoder is proposed.The region attention module is meant to extract the background feature map based on the distinct properties of the background feature map and the detail feature map.A multi-scale convolution attention module is suggested to enhance the communication of feature information.At the same time,the feature transformation module is introduced to learn more robust feature representations,aiming to preserve the integrity of image information.This study uses three available datasets from TNO,FLIR,and NIR to perform thorough quantitative and qualitative trials with five additional algorithms.The methods are assessed based on four indicators:information entropy(EN),standard deviation(SD),spatial frequency(SF),and average gradient(AG).Object detection experiments were done on the M3FD dataset to further verify the algorithm’s performance in comparison with five other algorithms.The algorithm’s accuracy was evaluated using the mean average precision at a threshold of 0.5(mAP@0.5)index.Comprehensive experimental findings show that CAEFusion performs well in subjective visual and objective evaluation criteria and has promising potential in downstream object detection tasks.展开更多
基金funded by the Chongqing Normal University Startup Foundation for PhD(22XLB021)supported by the Open Research Project of the State Key Laboratory of Industrial Control Technology,Zhejiang University,China(No.ICT2023B40).
文摘Cloud detection from satellite and drone imagery is crucial for applications such as weather forecasting and environmentalmonitoring.Addressing the limitations of conventional convolutional neural networks,we propose an innovative transformer-based method.This method leverages transformers,which are adept at processing data sequences,to enhance cloud detection accuracy.Additionally,we introduce a Cyclic Refinement Architecture that improves the resolution and quality of feature extraction,thereby aiding in the retention of critical details often lost during cloud detection.Our extensive experimental validation shows that our approach significantly outperforms established models,excelling in high-resolution feature extraction and precise cloud segmentation.By integrating Positional Visual Transformers(PVT)with this architecture,our method advances high-resolution feature delineation and segmentation accuracy.Ultimately,our research offers a novel perspective for surmounting traditional challenges in cloud detection and contributes to the advancement of precise and dependable image analysis across various domains.
基金the National Natural Science Foundation of China(42001408,61806097).
文摘Significant advancements have been achieved in road surface extraction based on high-resolution remote sensingimage processing. Most current methods rely on fully supervised learning, which necessitates enormous humaneffort to label the image. Within this field, other research endeavors utilize weakly supervised methods. Theseapproaches aim to reduce the expenses associated with annotation by leveraging sparsely annotated data, such asscribbles. This paper presents a novel technique called a weakly supervised network using scribble-supervised andedge-mask (WSSE-net). This network is a three-branch network architecture, whereby each branch is equippedwith a distinct decoder module dedicated to road extraction tasks. One of the branches is dedicated to generatingedge masks using edge detection algorithms and optimizing road edge details. The other two branches supervise themodel’s training by employing scribble labels and spreading scribble information throughout the image. To addressthe historical flaw that created pseudo-labels that are not updated with network training, we use mixup to blendprediction results dynamically and continually update new pseudo-labels to steer network training. Our solutiondemonstrates efficient operation by simultaneously considering both edge-mask aid and dynamic pseudo-labelsupport. The studies are conducted on three separate road datasets, which consist primarily of high-resolutionremote-sensing satellite photos and drone images. The experimental findings suggest that our methodologyperforms better than advanced scribble-supervised approaches and specific traditional fully supervised methods.
文摘Current data-driven deep learning(DL)methods typically reconstruct subsurface velocity models directly from pre-stack seismic records.However,these purely data-driven methods are often less robust and produce results that are less physically interpretative.Here,the authors propose a new method that uses migration images as input,combined with convolutional neural networks to construct high-resolution velocity models.Compared to directly using pre-stack seismic records as input,the nonlinearity between migration images and velocity models is significantly reduced.Additionally,the advantage of using migration images lies in its ability to more comprehensively capture the reflective properties of the subsurface medium,including amplitude and phase information,thereby to provide richer physical information in guiding the reconstruction of the velocity model.This approach not only improves the accuracy and resolution of the reconstructed velocity models,but also enhances the physical interpretability and robustness.Numerical experiments on synthetic data show that the proposed method has superior reconstruction performance and strong generalization capability when dealing with complex geological structures,and shows great potential in providing efficient solutions for the task of reconstructing high-wavenumber components.
基金Supported by National Natural Science Foundation of China,No.82071871Guangdong Basic and Applied Basic Research Foundation,No.2021A1515220131+1 种基金Guangdong Medical Science and Technology Research Fund Project,No.2022111520491834Clinical Research Project of Shenzhen Second People's Hospital,No.20223357022。
文摘BACKGROUND Intracranial atherosclerosis,a leading cause of stroke,involves arterial plaque formation.This study explores the link between plaque remodelling patterns and diabetes using high-resolution vessel wall imaging(HR-VWI).AIM To investigate the factors of intracranial atherosclerotic remodelling patterns and the relationship between intracranial atherosclerotic remodelling and diabetes mellitus using HR-VWI.METHODS Ninety-four patients diagnosed with middle cerebral artery or basilar artery INTRODUCTION Intracranial atherosclerotic disease is one of the main causes of ischaemic stroke in the world,accounting for approx-imately 10%of transient ischaemic attacks and 30%-50%of ischaemic strokes[1].It is the most common factor among Asian people[2].The adaptive changes in the structure and function of blood vessels that can adapt to changes in the internal and external environment are called vascular remodelling,which is a common and important pathological mechanism in atherosclerotic diseases,and the remodelling mode of atherosclerotic plaques is closely related to the occurrence of stroke.Positive remodelling(PR)is an outwards compensatory remodelling where the arterial wall grows outwards in an attempt to maintain a constant lumen diameter.For a long time,it was believed that the degree of stenosis can accurately reflect the risk of ischaemic stroke[3-5].Previous studies have revealed that lesions without significant luminal stenosis can also lead to acute events[6,7],as summarized in a recent meta-analysis study in which approximately 50%of acute/subacute ischaemic events were due to this type of lesion[6].Research[8,9]has pointed out that the PR of plaques is more dangerous and more likely to cause acute ischaemic stroke.Previous studies[10-13]have found that there are specific vascular remodelling phenomena in the coronary and carotid arteries of diabetic patients.However,due to the deep location and small lumen of intracranial arteries and limitations of imaging techniques,the relationship between intracranial arterial remodelling and diabetes is still unclear.In recent years,with the development of magnetic resonance technology and the emergence of high-resolution(HR)vascular wall imaging,a clear and multidimensional display of the intracranial vascular wall has been achieved.Therefore,in this study,HR wall imaging(HR-VWI)was used to display the remodelling characteristics of bilateral middle cerebral arteries and basilar arteries and to explore the factors of intracranial vascular remodelling and its relationship with diabetes.
基金This study was approved by the Medical Ethics Committee of Beijing Tsinghua Changgung Hospital(20002-0-02).
文摘BACKGROUND No studies have yet been conducted on changes in microcirculatory hemody-namics of colorectal adenomas in vivo under endoscopy.The microcirculation of the colorectal adenoma could be observed in vivo by a novel high-resolution magnification endoscopy with blue laser imaging(BLI),thus providing a new insight into the microcirculation of early colon tumors.AIM To observe the superficial microcirculation of colorectal adenomas using the novel magnifying colonoscope with BLI and quantitatively analyzed the changes in hemodynamic parameters.METHODS From October 2019 to January 2020,11 patients were screened for colon adenomas with the novel high-resolution magnification endoscope with BLI.Video images were recorded and processed with Adobe Premiere,Adobe Photoshop and Image-pro Plus software.Four microcirculation parameters:Microcirculation vessel density(MVD),mean vessel width(MVW)with width standard deviation(WSD),and blood flow velocity(BFV),were calculated for adenomas and the surrounding normal mucosa.RESULTS A total of 16 adenomas were identified.Compared with the normal surrounding mucosa,the superficial vessel density in the adenomas was decreased(MVD:0.95±0.18 vs 1.17±0.28μm/μm2,P<0.05).MVW(5.11±1.19 vs 4.16±0.76μm,P<0.05)and WSD(11.94±3.44 vs 9.04±3.74,P<0.05)were both increased.BFV slowed in the adenomas(709.74±213.28 vs 1256.51±383.31μm/s,P<0.05).CONCLUSION The novel high-resolution magnification endoscope with BLI can be used for in vivo study of adenoma superficial microcirculation.Superficial vessel density was decreased,more irregular,with slower blood flow.
基金Project supported by the National Key Research and Development Program of China (Grant No.2021YFB2012600)the Shanghai Aerospace Science and Technology Innovation Fund,China (Grant No.SAST-2022-102)。
文摘Counterfeiting of modern banknotes poses a significant challenge,prompting the use of various preventive measures.One such measure is the magnetic anti-counterfeiting strip.However,due to its inherent weak magnetic properties,visualizing its magnetic distribution has been a longstanding challenge.In this work,we introduce an innovative method by using a fiber optic diamond probe,a highly sensitive quantum sensor designed specifically for detecting extremely weak magnetic fields.We employ this probe to achieve high-resolution imaging of the magnetic fields associated with the RMB 50denomination anti-counterfeiting strip.Additionally,we conduct computer simulations by using COMSOL Multiphysics software to deduce the potential geometric characteristics and material composition of the magnetic region within the anti-counterfeiting strip.The findings and method presented in this study hold broader significance,extending the RMB 50 denomination to various denominations of the Chinese currency and other items that employ magnetic anti-counterfeiting strips.These advances have the potential to significantly improve and promote security measures in order to prevent the banknotes from being counterfeited.
基金National Natural Science Foundation of China(No.41871305)National Key Research and Development Program of China(No.2017YFC0602204)+2 种基金Fundamental Research Funds for the Central Universities,China University of Geosciences(Wuhan)(No.CUGQY1945)Open Fund of Key Laboratory of Geological Survey and Evaluation of Ministry of Education and the Fundamental Research Funds for the Central Universities(No.GLAB2019ZR02)Open Fund of Laboratory of Urban Land Resources Monitoring and Simulation,Ministry of Natural Resources,China(No.KF-2020-05-068)。
文摘The exploration of building detection plays an important role in urban planning,smart city and military.Aiming at the problem of high overlapping ratio of detection frames for dense building detection in high resolution remote sensing images,we present an effective YOLOv3 framework,corner regression-based YOLOv3(Correg-YOLOv3),to localize dense building accurately.This improved YOLOv3 algorithm establishes a vertex regression mechanism and an additional loss item about building vertex offsets relative to the center point of bounding box.By extending output dimensions,the trained model is able to output the rectangular bounding boxes and the building vertices meanwhile.Finally,we evaluate the performance of the Correg-YOLOv3 on our self-produced data set and provide a comparative analysis qualitatively and quantitatively.The experimental results achieve high performance in precision(96.45%),recall rate(95.75%),F1 score(96.10%)and average precision(98.05%),which were 2.73%,5.4%,4.1%and 4.73%higher than that of YOLOv3.Therefore,our proposed algorithm effectively tackles the problem of dense building detection in high resolution images.
基金supported by the National Natural Science Foundation of China under Grant Nos.12063002,12163004,and 12073077。
文摘To address the problem of the low accuracy of transverse velocity field measurements for small targets in highresolution solar images,we proposed a novel velocity field measurement method for high-resolution solar images based on PWCNet.This method transforms the transverse velocity field measurements into an optical flow field prediction problem.We evaluated the performance of the proposed method using the Hαand TiO data sets obtained from New Vacuum Solar Telescope observations.The experimental results show that our method effectively predicts the optical flow of small targets in images compared with several typical machine-and deeplearning methods.On the Hαdata set,the proposed method improves the image structure similarity from 0.9182 to0.9587 and reduces the mean of residuals from 24.9931 to 15.2818;on the TiO data set,the proposed method improves the image structure similarity from 0.9289 to 0.9628 and reduces the mean of residuals from 25.9908 to17.0194.The optical flow predicted using the proposed method can provide accurate data for the atmospheric motion information of solar images.The code implementing the proposed method is available on https://github.com/lygmsy123/transverse-velocity-field-measurement.
基金partially supported by the Japan Society for the Promotion of Science(JSPS)KAKENHI(JP22H03643)Japan Science and Technology Agency(JST)Support for Pioneering Research Initiated by the Next Generation(SPRING)(JPMJSP2145)JST through the Establishment of University Fellowships towards the Creation of Science Technology Innovation(JPMJFS2115)。
文摘Dear Editor,This letter proposes to integrate dendritic learnable network architecture with Vision Transformer to improve the accuracy of image recognition.In this study,based on the theory of dendritic neurons in neuroscience,we design a network that is more practical for engineering to classify visual features.Based on this,we propose a dendritic learning-incorporated vision Transformer(DVT),which out-performs other state-of-the-art methods on three image recognition benchmarks.
文摘A novel image fusion network framework with an autonomous encoder and decoder is suggested to increase thevisual impression of fused images by improving the quality of infrared and visible light picture fusion. The networkcomprises an encoder module, fusion layer, decoder module, and edge improvementmodule. The encoder moduleutilizes an enhanced Inception module for shallow feature extraction, then combines Res2Net and Transformerto achieve deep-level co-extraction of local and global features from the original picture. An edge enhancementmodule (EEM) is created to extract significant edge features. A modal maximum difference fusion strategy isintroduced to enhance the adaptive representation of information in various regions of the source image, therebyenhancing the contrast of the fused image. The encoder and the EEM module extract features, which are thencombined in the fusion layer to create a fused picture using the decoder. Three datasets were chosen to test thealgorithmproposed in this paper. The results of the experiments demonstrate that the network effectively preservesbackground and detail information in both infrared and visible images, yielding superior outcomes in subjectiveand objective evaluations.
基金supported by the National Natural Science Foundation of China(Grant Nos.42322408,42188101,41974211,and 42074202)the Key Research Program of Frontier Sciences,Chinese Academy of Sciences(Grant No.QYZDJ-SSW-JSC028)+1 种基金the Strategic Priority Program on Space Science,Chinese Academy of Sciences(Grant Nos.XDA15052500,XDA15350201,and XDA15014800)supported by the Youth Innovation Promotion Association of the Chinese Academy of Sciences(Grant No.Y202045)。
文摘Astronomical imaging technologies are basic tools for the exploration of the universe,providing basic data for the research of astronomy and space physics.The Soft X-ray Imager(SXI)carried by the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE)aims to capture two-dimensional(2-D)images of the Earth’s magnetosheath by using soft X-ray imaging.However,the observed 2-D images are affected by many noise factors,destroying the contained information,which is not conducive to the subsequent reconstruction of the three-dimensional(3-D)structure of the magnetopause.The analysis of SXI-simulated observation images shows that such damage cannot be evaluated with traditional restoration models.This makes it difficult to establish the mapping relationship between SXIsimulated observation images and target images by using mathematical models.We propose an image restoration algorithm for SXIsimulated observation images that can recover large-scale structure information on the magnetosphere.The idea is to train a patch estimator by selecting noise–clean patch pairs with the same distribution through the Classification–Expectation Maximization algorithm to achieve the restoration estimation of the SXI-simulated observation image,whose mapping relationship with the target image is established by the patch estimator.The Classification–Expectation Maximization algorithm is used to select multiple patch clusters with the same distribution and then train different patch estimators so as to improve the accuracy of the estimator.Experimental results showed that our image restoration algorithm is superior to other classical image restoration algorithms in the SXI-simulated observation image restoration task,according to the peak signal-to-noise ratio and structural similarity.The restoration results of SXI-simulated observation images are used in the tangent fitting approach and the computed tomography approach toward magnetospheric reconstruction techniques,significantly improving the reconstruction results.Hence,the proposed technology may be feasible for processing SXI-simulated observation images.
基金funding and support from the United Kingdom Space Agency(UKSA)the European Space Agency(ESA)+5 种基金funded and supported through the ESA PRODEX schemefunded through PRODEX PEA 4000123238the Research Council of Norway grant 223252funded by Spanish MCIN/AEI/10.13039/501100011033 grant PID2019-107061GB-C61funding and support from the Chinese Academy of Sciences(CAS)funding and support from the National Aeronautics and Space Administration(NASA)。
文摘The Soft X-ray Imager(SXI)is part of the scientific payload of the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE)mission.SMILE is a joint science mission between the European Space Agency(ESA)and the Chinese Academy of Sciences(CAS)and is due for launch in 2025.SXI is a compact X-ray telescope with a wide field-of-view(FOV)capable of encompassing large portions of Earth’s magnetosphere from the vantage point of the SMILE orbit.SXI is sensitive to the soft X-rays produced by the Solar Wind Charge eXchange(SWCX)process produced when heavy ions of solar wind origin interact with neutral particles in Earth’s exosphere.SWCX provides a mechanism for boundary detection within the magnetosphere,such as the position of Earth’s magnetopause,because the solar wind heavy ions have a very low density in regions of closed magnetic field lines.The sensitivity of the SXI is such that it can potentially track movements of the magnetopause on timescales of a few minutes and the orbit of SMILE will enable such movements to be tracked for segments lasting many hours.SXI is led by the University of Leicester in the United Kingdom(UK)with collaborating organisations on hardware,software and science support within the UK,Europe,China and the United States.
基金supported by the Research Council of Norway under contracts 223252/F50 and 300844/F50the Trond Mohn Foundation。
文摘Global images of auroras obtained by cameras on spacecraft are a key tool for studying the near-Earth environment.However,the cameras are sensitive not only to auroral emissions produced by precipitating particles,but also to dayglow emissions produced by photoelectrons induced by sunlight.Nightglow emissions and scattered sunlight can contribute to the background signal.To fully utilize such images in space science,background contamination must be removed to isolate the auroral signal.Here we outline a data-driven approach to modeling the background intensity in multiple images by formulating linear inverse problems based on B-splines and spherical harmonics.The approach is robust,flexible,and iteratively deselects outliers,such as auroral emissions.The final model is smooth across the terminator and accounts for slow temporal variations and large-scale asymmetries in the dayglow.We demonstrate the model by using the three far ultraviolet cameras on the Imager for Magnetopause-to-Aurora Global Exploration(IMAGE)mission.The method can be applied to historical missions and is relevant for upcoming missions,such as the Solar wind Magnetosphere Ionosphere Link Explorer(SMILE)mission.
基金supported by the National Natural Science Foundation of China(62375144 and 61875092)Tianjin Foundation of Natural Science(21JCYBJC00260)Beijing-Tianjin-Hebei Basic Research Cooperation Special Program(19JCZDJC65300).
文摘Limited by the dynamic range of the detector,saturation artifacts usually occur in optical coherence tomography(OCT)imaging for high scattering media.The available methods are difficult to remove saturation artifacts and restore texture completely in OCT images.We proposed a deep learning-based inpainting method of saturation artifacts in this paper.The generation mechanism of saturation artifacts was analyzed,and experimental and simulated datasets were built based on the mechanism.Enhanced super-resolution generative adversarial networks were trained by the clear–saturated phantom image pairs.The perfect reconstructed results of experimental zebrafish and thyroid OCT images proved its feasibility,strong generalization,and robustness.
文摘Throughout the SMILE mission the satellite will be bombarded by radiation which gradually damages the focal plane devices and degrades their performance.In order to understand the changes of the CCD370s within the soft X-ray Imager,an initial characterisation of the devices has been carried out to give a baseline performance level.Three CCDs have been characterised,the two flight devices and the flight spa re.This has been carried out at the Open University in a bespo ke cleanroom measure ment facility.The results show that there is a cluster of bright pixels in the flight spa re which increases in size with tempe rature.However at the nominal ope rating tempe rature(-120℃) it is within the procure ment specifications.Overall,the devices meet the specifications when ope rating at -120℃ in 6 × 6 binned frame transfer science mode.The se rial charge transfer inefficiency degrades with temperature in full frame mode.However any charge losses are recovered when binning/frame transfer is implemented.
基金This research was funded by the Natural Science Foundation of Hebei Province(F2021506004).
文摘Transformer-based models have facilitated significant advances in object detection.However,their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle(UAV)imagery.Addressing these limitations,we propose a hybrid transformer-based detector,H-DETR,and enhance it for dense small objects,leading to an accurate and efficient model.Firstly,we introduce a hybrid transformer encoder,which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently.Furthermore,we propose two novel strategies to enhance detection performance without incurring additional inference computation.Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression.Adversarial denoising learning is a novel enhancement method inspired by adversarial learning,which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise.Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach,achieving a significant improvement in accuracy with a reduction in computational complexity.Our method achieves 31.9%and 21.1%AP on the VisDrone and UAVDT datasets,respectively,and has a faster inference speed,making it a competitive model in UAV image object detection.
基金financially supported by the National Key Research and Development Program(Grant No.2022YFE0107000)the General Projects of the National Natural Science Foundation of China(Grant No.52171259)the High-Tech Ship Research Project of the Ministry of Industry and Information Technology(Grant No.[2021]342)。
文摘Identification of the ice channel is the basic technology for developing intelligent ships in ice-covered waters,which is important to ensure the safety and economy of navigation.In the Arctic,merchant ships with low ice class often navigate in channels opened up by icebreakers.Navigation in the ice channel often depends on good maneuverability skills and abundant experience from the captain to a large extent.The ship may get stuck if steered into ice fields off the channel.Under this circumstance,it is very important to study how to identify the boundary lines of ice channels with a reliable method.In this paper,a two-staged ice channel identification method is developed based on image segmentation and corner point regression.The first stage employs the image segmentation method to extract channel regions.In the second stage,an intelligent corner regression network is proposed to extract the channel boundary lines from the channel region.A non-intelligent angle-based filtering and clustering method is proposed and compared with corner point regression network.The training and evaluation of the segmentation method and corner regression network are carried out on the synthetic and real ice channel dataset.The evaluation results show that the accuracy of the method using the corner point regression network in the second stage is achieved as high as 73.33%on the synthetic ice channel dataset and 70.66%on the real ice channel dataset,and the processing speed can reach up to 14.58frames per second.
基金the China Postdoctoral Science Foundation under Grant 2021M701838the Natural Science Foundation of Hainan Province of China under Grants 621MS042 and 622MS067the Hainan Medical University Teaching Achievement Award Cultivation under Grant HYjcpx202209.
文摘Watermarks can provide reliable and secure copyright protection for optical coherence tomography(OCT)fundus images.The effective image segmentation is helpful for promoting OCT image watermarking.However,OCT images have a large amount of low-quality data,which seriously affects the performance of segmentationmethods.Therefore,this paper proposes an effective segmentation method for OCT fundus image watermarking using a rough convolutional neural network(RCNN).First,the rough-set-based feature discretization module is designed to preprocess the input data.Second,a dual attention mechanism for feature channels and spatial regions in the CNN is added to enable the model to adaptively select important information for fusion.Finally,the refinement module for enhancing the extraction power of multi-scale information is added to improve the edge accuracy in segmentation.RCNN is compared with CE-Net and MultiResUNet on 83 gold standard 3D retinal OCT data samples.The average dice similarly coefficient(DSC)obtained by RCNN is 6%higher than that of CE-Net.The average 95 percent Hausdorff distance(95HD)and average symmetric surface distance(ASD)obtained by RCNN are 32.4%and 33.3%lower than those of MultiResUNet,respectively.We also evaluate the effect of feature discretization,as well as analyze the initial learning rate of RCNN and conduct ablation experiments with the four different models.The experimental results indicate that our method can improve the segmentation accuracy of OCT fundus images,providing strong support for its application in medical image watermarking.
基金the TCL Science and Technology Innovation Fundthe Youth Science and Technology Talent Promotion Project of Jiangsu Association for Science and Technology,Grant/Award Number:JSTJ‐2023‐017+4 种基金Shenzhen Municipal Science and Technology Innovation Council,Grant/Award Number:JSGG20220831105002004National Natural Science Foundation of China,Grant/Award Number:62201468Postdoctoral Research Foundation of China,Grant/Award Number:2022M722599the Fundamental Research Funds for the Central Universities,Grant/Award Number:D5000210966the Guangdong Basic and Applied Basic Research Foundation,Grant/Award Number:2021A1515110079。
文摘Convolutional neural networks depend on deep network architectures to extract accurate information for image super‐resolution.However,obtained information of these con-volutional neural networks cannot completely express predicted high‐quality images for complex scenes.A dynamic network for image super‐resolution(DSRNet)is presented,which contains a residual enhancement block,wide enhancement block,feature refine-ment block and construction block.The residual enhancement block is composed of a residual enhanced architecture to facilitate hierarchical features for image super‐resolution.To enhance robustness of obtained super‐resolution model for complex scenes,a wide enhancement block achieves a dynamic architecture to learn more robust information to enhance applicability of an obtained super‐resolution model for varying scenes.To prevent interference of components in a wide enhancement block,a refine-ment block utilises a stacked architecture to accurately learn obtained features.Also,a residual learning operation is embedded in the refinement block to prevent long‐term dependency problem.Finally,a construction block is responsible for reconstructing high‐quality images.Designed heterogeneous architecture can not only facilitate richer structural information,but also be lightweight,which is suitable for mobile digital devices.Experimental results show that our method is more competitive in terms of performance,recovering time of image super‐resolution and complexity.The code of DSRNet can be obtained at https://github.com/hellloxiaotian/DSRNet.
文摘To address the issues of incomplete information,blurred details,loss of details,and insufficient contrast in infrared and visible image fusion,an image fusion algorithm based on a convolutional autoencoder is proposed.The region attention module is meant to extract the background feature map based on the distinct properties of the background feature map and the detail feature map.A multi-scale convolution attention module is suggested to enhance the communication of feature information.At the same time,the feature transformation module is introduced to learn more robust feature representations,aiming to preserve the integrity of image information.This study uses three available datasets from TNO,FLIR,and NIR to perform thorough quantitative and qualitative trials with five additional algorithms.The methods are assessed based on four indicators:information entropy(EN),standard deviation(SD),spatial frequency(SF),and average gradient(AG).Object detection experiments were done on the M3FD dataset to further verify the algorithm’s performance in comparison with five other algorithms.The algorithm’s accuracy was evaluated using the mean average precision at a threshold of 0.5(mAP@0.5)index.Comprehensive experimental findings show that CAEFusion performs well in subjective visual and objective evaluation criteria and has promising potential in downstream object detection tasks.